More than 2 years ago, I decided to try to create a brief, digestible manual on the expectations of management for senior engineers at my company who are considering making the shift. At the time, I had about 3 years of management experience, including two prior companies. Enough to feel like I knew how to do the job, but not enough to feel like I should be some kind of authority on management. After letting this marinate for a couple years, I’m ready to share what I have learned.
I have shared the principles below with people who are making the transition, and they have been received positively. Within the same span of time, my total number of people-months of management has dramatically increased. I have also been managing managers for more than a year, which has added another dimension to my experience. Additionally, I have had the chance to interact with many more peers in management, from whom I’ve learned so much.
I do things differently than managers I have worked for. That has sometimes made me question whether I was doing my job at all. But I know now, from the successes my teams have achieved, that my approach works well for me, in the teams I’ve run, within the organizations we work within. All of this to say that while I don’t consider myself some kind of guru, I have gained confidence in my thinking.
The management principles below reflect lessons I have learned the hard way. These are things I wish I had known and internalized when I first became a manager. By keeping this brief, I hope that it will be easily consumed, perhaps at the cost of people being able to fully understand each point. Maybe I will expand upon these over time.
Lastly, before I share my principles, I want to say that management and leadership are personal. Trying to be someone else is inauthentic, and as a leader, this is one of the easiest ways to sabotage yourself. These are the principles that work for me, but hopefully, there is something in here for others to adapt into their personal styles.
Managing comes first
Most managers are strong executors. They may inherit executional responsibilities, have the tendency to fall back on executing to solve problems, or want to let execution take priority. Managing comes first. Management can require large blocks of time, sometimes without warning, which means it’s crucial to stay out of the critical path of execution, wherever possible.
Facilitate wellbeing
Personal safety, dignity, and wellbeing of every team member are paramount. Team success is only success if team members feel good about it. Be accountable for the culture, while empowering team members to build it. Make sure the quiet voices are heard.
Practice integrity
Keep confidential company and personal information in confidence. Be transparent with information you can share. Be transparent about when you cannot do something. Keep your commitments. Own up to your mistakes. The trust you build is the currency you spend when delivering difficult news or feedback. Represent your people to the company and the company to your people; sometimes, this will endear you to neither, but that’s the job.
Cultivate relationships
As an IC, building relationships is useful. As a manager, it is essential. Find ways to connect with your reports. Go the extra mile. Build connections with diverse people outside your chain of command. Serve up connections to other people.
Be the example
Everyone looks to manager behavior to see what is valued by the company, in reality. That includes what you do and what you do not do. Being a manager means having to “be the adult in the room”. Be someone who lowers the temperature of stressful situations and conflict. Be proactive when challenging situations arise, rather than reacting after they’ve grown into fires.
Be informed
Information is the main tool of the manager and one of the most valuable things they can provide to both their reports and their own management chain. Have the pulse of the people around you. Know what’s happening, what may happen, and why.
Give credit, take blame
Make it safe for individuals to take initiative. When your team succeeds, distribute credit and deflect personal acclaim. Take accountability for failure. Represent your team members in places they can’t represent themselves.
Optimize work distribution
Managers have a portfolio of work that the business needs and people with work preferences. Optimize the dual objectives of delivering value to the organization and giving individuals problems that build their skillset, impact, satisfaction, and/or advancement. Performance is contextual; set people up to shine.
Lead by influence, not fiat
Having to command people to do things is a last resort, and usually indicates a failure to lead by informing and influencing. Be aware that because of power differential, manager suggestions are interpreted as commands. Coach, don’t command.
Distribute problems, not solutions
Managers with excellent execution skills and deep domain knowledge must resist the urge to present solutions to their reports. Reports learn by discovering solutions. Create safe learning opportunities. Create time for learning to occur. Give constraints, and justifications for the constraints.
Work on yourself
Investments in yourself are magnified in how effectively you can help your reports. Self-awareness is critical; your blind spots and rough corners can hurt other people. Always be reflecting. Always be learning.
Two-Phase Locking (2PL) was the first general-purpose concurrency control to provide Serializability. In fact, 2PL gives more than Serializability: it gives Opacity, a much stronger isolation level. 2PL was published in 1976, which incidentally is the year I was born, and it is likely that Jim Gray and his buddies had this idea long before it was published, which means 2PL first came into existence nearly 50 years ago.
After all that time has passed, is this the best we can do?
Turns out no, we can do better with 2PLSF, but let's start from the beginning.
When I use the term "general-purpose concurrency control" I mean an algorithm which allows access to multiple objects (or records or tuples, whatever you want to name them) with all-or-nothing semantics. In other words, an algorithm that lets you do transactions over multiple data items. Two-Phase Locking has several advantages over the other concurrency controls that have since been invented, but in my view there are two important ones: simplicity and a strong isolation level.
In 2PL, before accessing a record for read or write access, we must first take the lock that protects this record. During the transaction, we keep acquiring locks for each access, and only at the end of the transaction, when we know that no further accesses will be made, can we release all the locks. Having an instant in time (i.e. the end of the transaction) where all the locks are taken on the data that we accessed, means that there is a linearization point for our operation (transaction), which means we have a consistent view of the different records and can write to other records all in a consistent way. It doesn't get much simpler than this. Today, this idea may sound embarrassingly obvious, but 50 years ago many database researchers thought that it was ok to release the locks after completing the access to the record. And yes, it is possible to do so, but such a concurrency control is not serializable.
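The rule described above can be sketched in a few lines. This is only a minimal illustration, assuming one mutual exclusion lock per record; the names `Record`, `Transaction` and `transfer` are hypothetical, and the sketch has no abort or rollback logic and simply blocks on a taken lock, so it says nothing about deadlock or live-lock handling.

```python
import threading

class Record:
    """A data item guarded by its own mutual exclusion lock."""
    def __init__(self, value):
        self.value = value
        self.lock = threading.Lock()

class Transaction:
    """Takes a lock before every access and releases all locks only at the end."""
    def __init__(self):
        self.held = []  # records whose locks this transaction owns

    def _acquire(self, record):
        # Growing phase: lock the record the first time we touch it.
        if not any(r is record for r in self.held):
            record.lock.acquire()
            self.held.append(record)

    def read(self, record):
        self._acquire(record)
        return record.value

    def write(self, record, value):
        self._acquire(record)
        record.value = value

    def commit(self):
        # Shrinking phase: only now, when no further access will be made,
        # are the locks released.
        for record in reversed(self.held):
            record.lock.release()
        self.held.clear()

def transfer(src, dst, amount):
    """Move `amount` from src to dst as one transaction."""
    tx = Transaction()
    try:
        tx.write(src, tx.read(src) - amount)
        tx.write(dst, tx.read(dst) + amount)
    finally:
        tx.commit()
```

Releasing each lock right after its access, instead of in `commit()`, is the non-serializable variant mentioned above.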
As for strong isolation, database researchers continue to invent concurrency controls that are not serializable, and write papers about it, which means Serializability is not that important for Databases. On the other hand, all transactional commercial databases that I know of, use 2PL or some combination of it with T/O or MVCC.
Moreover, in the field of concurrent data structures, linearizability is the gold standard, which means 2PL is used heavily. If we need to write to multiple nodes of a data structure in a consistent way, we typically need something like 2PL, at least for the write accesses. The exception to this are lock-free data-structures, but hey, that's why (correct) lock-free is hard!
Ok, 2PL is easy to use and has strong isolation, so this means we're ready to go and don't need anything better than 2PL, right? I'm afraid not. 2PL has a couple of big disadvantages: poor read-scalability and live-lock progress.
The classic 2PL was designed for mutual exclusion locks, which means that when two threads are performing a read-access on the same record, they will conflict and one of them (or both) will abort and restart. This problem can be solved by replacing the mutual exclusion locks with reader-writer locks, but it's not as simple as this. Mutual exclusion locks can be implemented with a single bit, representing the state of locked or unlocked. Reader-writer locks also need this bit and, in addition, need a counter of the number of readers currently holding the lock in read mode. This counter needs enough bits to represent the number of readers. For example, a 7-bit counter can count up to 127 readers, which caps the number of threads that may hold the read-lock on a particular reader-writer lock instance at the same time. For such a scenario this implies that each lock would take 1 byte, which may not sound like much, but if you have billions of records in your DB then you will need billions of bytes for those locks. Still reasonable, but now we get into the problem of contention on the counter.
Certain workloads have lots of read accesses on the same data, i.e. they are read-non-disjoint. An example of this is the root node of a binary search tree, where all operations need to read the root before they start descending the nodes of the tree. When using 2PL, each of these accesses on the root node implies a lock acquisition and, even if we're using reader-writer locks, it implies heavy contention on the lock that protects the root node.
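To make the one-byte layout concrete, here is a rough sketch of such a lock word, assuming the top bit marks write mode and the low 7 bits count readers. The function names are made up for illustration, and the functions just compute the next value of the word; a real lock would apply each transition with an atomic compare-and-swap on the byte.

```python
WRITER_BIT = 0x80    # top bit: lock is held in write mode
READER_MASK = 0x7F   # low 7 bits: number of readers holding the lock

def try_read_lock(word):
    """Return the new lock word, or None if the read-lock cannot be granted."""
    if word & WRITER_BIT:
        return None                      # a writer holds the lock
    if (word & READER_MASK) == READER_MASK:
        return None                      # the 7-bit counter is full
    return word + 1

def read_unlock(word):
    """Return the lock word after one reader leaves."""
    return word - 1

def try_write_lock(word):
    """Return the new lock word, or None if any reader or writer is present."""
    return WRITER_BIT if word == 0 else None

def lock_table_bytes(num_records, bytes_per_lock=1):
    """Memory needed when every record has its own lock."""
    return num_records * bytes_per_lock
```

Every `try_read_lock` and `read_unlock` modifies the same byte, which is exactly where the contention on the counter comes from.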
Previous approaches have taken a stab at this problem, for example TLRW by Dave Dice and Nir Shavit in SPAA 2010. By using reader-writer locks they were able to have much better performance than using mutual exclusion locks, but still far from what the optimistic concurrency controls can achieve. Take the example of the plot below where we have an implementation similar to TLRW with each read-access contending on a single variable of the reader-writer lock, applied to a binary search tree, a Rank-based Relaxed AVL. Scalability is flat regardless of whether we're doing mostly write-transactions (left side plot) or just read-transactions (right side plot).
Turns out it is possible to overcome this "read-indicator contention" problem through the usage of scalable read-indicators. Our favorite algorithm is a reader-writer lock where each reader announces its arrival/departure on a separate cache line, thus having no contention for read-lock acquisition. The downside is that the thread taking the write-lock must scan through all those cache lines to ascertain whether the write-lock is granted, thus incurring a higher cost for the write-lock acquisition. As far as I know, the first reader-writer lock algorithms with this technique were shown in the paper "NUMA-Aware Reader-Writer Locks", of which Dave Dice and Nir Shavit are two of the authors, along with Irina Calciu, Yossi Lev, Victor Luchangco, and Virendra Marathe. This paper shows three different reader-writer lock algorithms, two with high scalability, but neither is starvation-free.
So what we did was take some of these ideas to make a better reader-writer lock, which also scales well for read-lock acquisition but has other properties, and we used this to implement our own concurrency control which we called Two-Phase Locking Starvation-Free (2PLSF). The reader-writer locks in 2PLSF have one bit per thread reserved for the read-lock but they are located in their own cache line, along with the bits (read-indicators) of the next adjacent locks.
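The general shape of a read-indicator with one slot per thread can be sketched as follows. This is not the 2PLSF lock; it is a single-threaded illustration of the idea, with a hypothetical class name, a plain Python list standing in for per-thread bits spread over cache lines, and ordinary assignments where a real implementation would need atomic operations and memory fences.

```python
class ReadIndicatorLock:
    """Reader-writer lock sketch: each reader only ever touches its own slot."""

    def __init__(self, max_threads):
        self.readers = [False] * max_threads  # one read-indicator per thread
        self.writer = None                    # thread id of the writer, if any

    def try_read_lock(self, tid):
        self.readers[tid] = True              # announce arrival in our own slot
        if self.writer is not None:
            self.readers[tid] = False         # a writer is present: back out
            return False
        return True

    def read_unlock(self, tid):
        self.readers[tid] = False             # announce departure

    def try_write_lock(self, tid):
        if self.writer is not None:
            return False
        self.writer = tid                     # claim write mode first
        if any(self.readers):                 # then scan every reader slot
            self.writer = None                # a reader is present: back out
            return False
        return True

    def write_unlock(self, tid):
        if self.writer == tid:
            self.writer = None
```

Readers never write to a shared location, while `try_write_lock` pays for a scan over all slots, which is the trade-off discussed next.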
As in the "NUMA-Aware reader-writer locks" paper, the cost shifts to the write-lock acquisition which needs to scan multiple cache lines to acquire the write-lock. There is no magic here, just trade-offs, but this turns out to be a pretty good trade-off as most workloads tend to be on the read-heavy side. Even write-intensive workloads spend a good amount of time executing read-accesses, for example, during the record lookup phase. With our improved reader-writer lock the same benchmark shown previously for the binary search tree looks very different:
With this improved reader-writer lock we are able to scale 2PL even on read-non-disjoint workloads, but it still leaves out the other major disadvantage: 2PL is prone to live-lock.
There are several variants of the original 2PL, some of these variants aren't even serializable, therefore I wouldn't call them 2PL anymore and won't bother going into that. For the classical 2PL, there are three variants and they are mostly about how to deal with contention. They're usually named:
- No-Wait
- Wait-Or-Die
- Deadlock-detection
When a conflict is encountered, the No-Wait variant aborts its own transaction (or the other transaction) and retries. This retry can be immediate, or it can be later, based on an exponential backoff scheme. The No-Wait approach has live-lock progress because two transactions with one attempting to modify record A and then record B, while the other is attempting to modify record B and then record A, may indefinitely conflict with each other and abort-restart without any of them ever being able to commit.
The Deadlock-Detection variant keeps an internal list of threads waiting on a lock and detects cycles (deadlocks). This is problematic for reader-writer locks because it would require each reader to have its own list, which itself needs a (mutual exclusion) lock to protect it. And detecting the cycles would mean scanning all the readers' lists when the lock is taken in read-lock mode. Theoretically it should be possible to make this scheme starvation-free, but it would require using starvation-free locks, and as there is no (published) highly scalable reader-writer lock with starvation-free progress, it kind of defeats the purpose. Moreover, having one list per reader may lead to high memory usage. Who knows, maybe one day someone will try this approach.
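A small sketch of the No-Wait rule, assuming plain mutual exclusion locks: on a conflict the transaction gives up everything it holds and restarts. The helper names and the backoff constants are illustrative assumptions, not values from any particular implementation.

```python
import threading

class Conflict(Exception):
    """Raised when a lock is already taken and the transaction must restart."""

def no_wait_acquire(lock, held):
    """Try to take `lock` without waiting; on conflict release all held locks."""
    if not lock.acquire(blocking=False):
        for h in reversed(held):
            h.release()
        held.clear()
        raise Conflict()
    held.append(lock)

def backoff_delay(attempt, base=1e-6, cap=1e-3):
    """Exponential backoff in seconds before the next retry (assumed constants)."""
    return min(cap, base * (2 ** attempt))
```

With transaction 1 holding A and asking for B while transaction 2 holds B and asks for A, each call to `no_wait_acquire` can keep raising `Conflict` for one side or the other. Nothing in the rule guarantees that either transaction eventually commits, which is the live-lock described above.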
The Wait-Or-Die variant imposes an order on all transactions, typically with a timestamp of when the transaction started and, when a lock conflict arises, decides to wait for the lock or to abort, by comparing the timestamp of the transaction with the timestamp of the lock owner. This works fine for mutual exclusion locks as the owner can be stored in the lock itself using a unique-thread identifier, but if we want to do it for reader-writer locks then a thread-id would be needed per reader. If we want to support 256 threads then this means we need 8 bits x 256 = 256 bytes per reader-writer lock. Using 256 bytes per lock is a hard pill to swallow!
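Both the decision rule and the memory arithmetic are easy to write down. The sketch below assumes the common convention in which an older transaction (smaller timestamp) waits and a younger one aborts; the text above does not say which direction is used, so treat that as an assumption. All function names are hypothetical.

```python
def id_bits(max_threads):
    """Bits needed for a unique thread id in the range 0..max_threads-1."""
    return max(1, (max_threads - 1).bit_length())

def owner_bytes_per_lock(max_threads):
    """Bytes per reader-writer lock if every potential reader stores its id."""
    return (id_bits(max_threads) * max_threads + 7) // 8

def wait_or_die(requester_ts, owner_timestamps):
    """Return "wait" if the requester is older than every owner, else "die"."""
    if all(requester_ts < ts for ts in owner_timestamps):
        return "wait"
    return "die"
```

For 256 threads, `owner_bytes_per_lock(256)` gives the 256 bytes computed above. With a reader-writer lock the requester has to be compared against every current reader, which is why `wait_or_die` takes a list of owner timestamps.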
But memory usage is not the real obstacle here. The Wait-Or-Die approach implies that all transactions have a unique transaction id so as to order them, for example, they can take a number from an atomic variable using a fetch_and_add() instruction. The problem with this is that on most modern CPUs you won't be able to do more than 40 million fetch_and_add() operations per second on a contended atomic variable. This may seem like a lot (Visa does about 660 million transactions per day, so doing 40 million per second sounds pretty good), but when it comes to in-memory DBMS it's not that much, and particularly for concurrent data structures it is a bit on the low side. Even worse, this atomic fetch_and_add() must be done for all transactions, whether they are write-transactions or read-transactions. For example, on one of our machines it's not really possible to go above 20 M fetch_and_add() per second, which means that scalability suckz:
To put this in perspective, one of my favorite concurrency controls is TL2 which was invented by (surprise!) none other than Dave Dice, Nir Shavit and Ori Shalev. I hope by now you know who the experts in this stuff are ;)
Anyways, in TL2 the read-transactions don't need to do an atomic fetch_and_add(), and they execute optimistic reads, which is faster than any read-lock acquisition you can think of. At least for read-transactions, TL2 can scale to hundreds of millions of transactions per second. By comparison, 2PL with Wait-Or-Die will never be able to go above 40 M tps. This means if high scalability is your goal, then you would be better off with TL2 than 2PL… except, 2PLSF solves this problem too.
In 2PLSF only the transactions that go into conflict need to be ordered, i.e. only these need to do a fetch_and_add() on a central atomic variable. This has two benefits: it means there is less contention on the central atomic variable that assigns the unique transaction id, and it means that transactions without conflicts are not bounded by the 40 M tps plateau. This means that we can have 200 M tps running without conflict and then 40 M tps that are having conflict, because the conflicting transactions are the only ones that need to do the fetch_and_add() and are therefore the only ones bounded by the maximum number of fetch_and_add() operations the CPU can execute per second. On top of this, the 2PLSF algorithm provides starvation-freedom.
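The "take a timestamp only on conflict" idea can be illustrated with a toy model. This is not the 2PLSF algorithm, just the ordering step described in the paragraph above; `TimestampClock` stands in for the central atomic variable and all names are hypothetical.

```python
class TimestampClock:
    """Stand-in for a central atomic counter incremented with fetch_and_add()."""
    def __init__(self):
        self._next = 1
        self.fetch_and_adds = 0   # how many times the shared counter was hit

    def fetch_and_add(self):
        ts = self._next
        self._next += 1
        self.fetch_and_adds += 1
        return ts

class Tx:
    """A transaction that asks for a timestamp only when it hits a conflict."""
    def __init__(self, clock):
        self.clock = clock
        self.ts = None            # no timestamp until the first conflict

    def on_conflict(self):
        if self.ts is None:
            self.ts = self.clock.fetch_and_add()
        return self.ts

def run(num_transactions, num_conflicting):
    """Count shared-counter operations when only some transactions conflict."""
    clock = TimestampClock()
    txs = [Tx(clock) for _ in range(num_transactions)]
    for tx in txs[:num_conflicting]:
        tx.on_conflict()
    return clock.fetch_and_adds
```

In this model `run(240, 40)` touches the shared counter 40 times instead of 240, mirroring the 200 M plus 40 M tps example: the conflict-free transactions never go near the contended variable.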
Summary
In this post we saw some of the advantages and disadvantages of 2PL and some of the variants of 2PL. We explained what it takes to scale 2PL: make a better reader-writer lock. But the big disadvantage of 2PL is the live-lock progress, which some variants could seemingly resolve, but in practice they don't because they will not scale, even with a better reader-writer lock. Then we described 2PLSF, a novel algorithm invented by me, Andreia and Pascal Felber to address these issues.
In summary, 2PLSF is what 2PL should have been from the start, a concurrency control that scales well even when reads are non-disjoint and that provides starvation-free transactions, the highest form of blocking progress there is. Moreover, it's pretty good at solving certain kinds of conflicts, which means it can scale even when some conflicts arise. 2PLSF is not perfect, but it's good enough, certainly better than TL2 when it comes to solving conflicts.
Despite being two-phase locking, it's as close to 2PL as a jackhammer is to a pickaxe.
Building and operating a pretty big storage system called S3
Today, I am publishing a guest post from Andy Warfield, VP and distinguished engineer over at S3. I asked him to write this based on the Keynote address he gave at USENIX FAST ‘23 that covers three distinct perspectives on scale that come along with building and operating a storage system the size of S3.
In today’s world of short-form snackable content, we’re very fortunate to get an excellent in-depth exposé. It’s one that I find particularly fascinating, and it provides some really unique insights into why people like Andy and me joined Amazon in the first place. The full recording of Andy presenting this paper at FAST is embedded at the end of this post.
–W
Building and operating a pretty big storage system called S3
I’ve worked in computer systems software — operating systems, virtualization, storage, networks, and security — for my entire career. However, the last six years working with Amazon Simple Storage Service (S3) have forced me to think about systems in broader terms than I ever have before. In a given week, I get to be involved in everything from hard disk mechanics, firmware, and the physical properties of storage media at one end, to customer-facing performance experience and API expressiveness at the other. And the boundaries of the system are not just technical ones: I’ve had the opportunity to help engineering teams move faster, worked with finance and hardware teams to build cost-following services, and worked with customers to create gob-smackingly cool applications in areas like video streaming, genomics, and generative AI.
What I’d really like to share with you more than anything else is my sense of wonder at the storage systems that are all collectively being built at this point in time, because they are pretty amazing. In this post, I want to cover a few of the interesting nuances of building something like S3, and the lessons learned and sometimes surprising observations from my time in S3.
17 years ago, on a university campus far, far away…
S3 launched on March 14th, 2006, which means it turned 17 this year. It’s hard for me to wrap my head around the fact that for engineers starting their careers today, S3 has simply existed as an internet storage service for as long as you’ve been working with computers. Seventeen years ago, I was just finishing my PhD at the University of Cambridge. I was working in the lab that developed Xen, an open-source hypervisor that a few companies, including Amazon, were using to build the first public clouds. A group of us moved on from the Xen project at Cambridge to create a startup called XenSource that, instead of using Xen to build a public cloud, aimed to commercialize it by selling it as enterprise software. You might say that we missed a bit of an opportunity there. XenSource grew and was eventually acquired by Citrix, and I wound up learning a whole lot about growing teams and growing a business (and negotiating commercial leases, and fixing small server room HVAC systems, and so on) – things that I wasn’t exposed to in grad school.
But at the time, what I was convinced I really wanted to do was to be a university professor. I applied for a bunch of faculty jobs and wound up finding one at UBC (which worked out really well, because my wife already had a job in Vancouver and we love the city). I threw myself into the faculty role and foolishly grew my lab to 18 students, which is something that I’d encourage anyone that’s starting out as an assistant professor to never, ever do. It was thrilling to have such a large lab full of amazing people and it was absolutely exhausting to try to supervise that many graduate students all at once, and I’m pretty sure I did a horrible job of it. That said, our research lab was an incredible community of people and we built things that I’m still really proud of today, and we wrote all sorts of really fun papers on security, storage, virtualization, and networking.
A little over two years into my professor job at UBC, a few of my students and I decided to do another startup. We started a company called Coho Data that took advantage of two really early technologies at the time: NVMe SSDs and programmable ethernet switches, to build a high-performance scale-out storage appliance. We grew Coho to about 150 people with offices in four countries, and once again it was an opportunity to learn things about stuff like the load bearing strength of second-floor server room floors, and analytics workflows in Wall Street hedge funds – both of which were well outside my training as a CS researcher and teacher. Coho was a wonderful and deeply educational experience, but in the end, the company didn’t work out and we had to wind it down.
And so, I found myself sitting back in my mostly empty office at UBC. I realized that I’d graduated my last PhD student, and I wasn’t sure that I had the strength to start building a research lab from scratch all over again. I also felt like if I was going to be in a professor job where I was expected to teach students about the cloud, that I might do well to get some first-hand experience with how it actually works.
I interviewed at some cloud providers, and had an especially fun time talking to the folks at Amazon and decided to join. And that’s where I work now. I’m based in Vancouver, and I’m an engineer that gets to work across all of Amazon’s storage products. So far, a whole lot of my time has been spent on S3.
How S3 works
When I joined Amazon in 2017, I arranged to spend most of my first day at work with Seth Markle. Seth is one of S3’s early engineers, and he took me into a little room with a whiteboard and then spent six hours explaining how S3 worked.
It was awesome. We drew pictures, and I asked question after question non-stop and I couldn’t stump Seth. It was exhausting, but in the best kind of way. Even then S3 was a very large system, but in broad strokes — which was what we started with on the whiteboard — it probably looks like most other storage systems that you’ve seen.
Amazon Simple Storage Service - Simple, right?
S3 is an object storage service with an HTTP REST API. There is a frontend fleet with a REST API, a namespace service, a storage fleet that’s full of hard disks, and a fleet that does background operations. In an enterprise context we might call these background tasks “data services,” like replication and tiering. What’s interesting here, when you look at the highest-level block diagram of S3’s technical design, is the fact that AWS tends to ship its org chart. This is a phrase that’s often used in a pretty disparaging way, but in this case it’s absolutely fascinating. Each of these broad components is a part of the S3 organization. Each has a leader, and a bunch of teams that work on it. And if we went into the next level of detail in the diagram, expanding one of these boxes out into the individual components that are inside it, what we’d find is that all the nested components are their own teams, have their own fleets, and, in many ways, operate like independent businesses.
All in, S3 today is composed of hundreds of microservices that are structured this way. Interactions between these teams are literally API-level contracts, and, just like the code that we all write, sometimes we get modularity wrong and those team-level interactions are kind of inefficient and clunky, and it’s a bunch of work to go and fix it, but that’s part of building software, and it turns out, part of building software teams too.
Two early observations
Before Amazon, I’d worked on research software, I’d worked on pretty widely adopted open-source software, and I’d worked on enterprise software and hardware appliances that were used in production inside some really large businesses. But by and large, that software was a thing we designed, built, tested, and shipped. It was the software that we packaged and the software that we delivered. Sure, we had escalations and support cases and we fixed bugs and shipped patches and updates, but we ultimately delivered software. Working on a global storage service like S3 was completely different: S3 is effectively a living, breathing organism. Everything, from developers writing code running next to the hard disks at the bottom of the software stack, to technicians installing new racks of storage capacity in our data centers, to customers tuning applications for performance, everything is one single, continuously evolving system. S3’s customers aren’t buying software, they are buying a service and they expect the experience of using that service to be continuously, predictably fantastic.
The first observation was that I was going to have to change, and really broaden how I thought about software systems and how they behave. This didn’t just mean broadening thinking about software to include those hundreds of microservices that make up S3, it meant broadening to also include all the people who design, build, deploy, and operate all that code. It’s all one thing, and you can’t really think about it just as software. It’s software, hardware, and people, and it’s always growing and constantly evolving.
The second observation was that despite the fact that this whiteboard diagram sketched the broad strokes of the organization and the software, it was also wildly misleading, because it completely obscured the scale of the system. Each one of the boxes represents its own collection of scaled out software services, often themselves built from collections of services. It would literally take me years to come to terms with the scale of the system that I was working with, and even today I often find myself surprised at the consequences of that scale.
S3 by the numbers (as of publishing this post).
Technical Scale: Scale and the physics of storage
It probably isn’t very surprising for me to mention that S3 is a really big system, and it is built using a LOT of hard disks. Millions of them. And if we’re talking about S3, it’s worth spending a little bit of time talking about hard drives themselves. Hard drives are amazing, and they’ve kind of always been amazing.
The first hard drive was built by Jacob Rabinow, who was a researcher for the predecessor of the National Institute of Standards and Technology (NIST). Rabinow was an expert in magnets and mechanical engineering, and he’d been asked to build a machine to do magnetic storage on flat sheets of media, almost like pages in a book. He decided that idea was too complex and inefficient, so, stealing the idea of a spinning disk from record players, he built an array of spinning magnetic disks that could be read by a single head. To make that work, he cut a pizza slice-style notch out of each disk that the head could move through to reach the appropriate platter. Rabinow described this as being “like reading a book without opening it.” The first commercially available hard disk appeared 7 years later in 1956, when IBM introduced the 350 disk storage unit, as part of the 305 RAMAC computer system. We’ll come back to the RAMAC in a bit.
The first magnetic memory device. Credit: https://www.computerhistory.org/storageengine/rabinow-patents-magnetic-disk-data-storage/
Today, 67 years after that first commercial drive was introduced, the world uses lots of hard drives. Globally, the number of bytes stored on hard disks continues to grow every year, but the applications of hard drives are clearly diminishing. We just seem to be using hard drives for fewer and fewer things. Today, consumer devices are effectively all solid-state, and a large amount of enterprise storage is similarly switching to SSDs. Jim Gray predicted this direction in 2006, when he very presciently said: “Tape is Dead. Disk is Tape. Flash is Disk. RAM Locality is King.” This quote has been used a lot over the past couple of decades to motivate flash storage, but the thing it observes about disks is just as interesting.
Hard disks don’t fill the role of general storage media that they used to because they are big (physically and in terms of bytes), slower, and relatively fragile pieces of media. For almost every common storage application, flash is superior. But hard drives are absolute marvels of technology and innovation, and for the things they are good at, they are absolutely amazing. One of these strengths is cost efficiency, and in a large-scale system like S3, there are some unique opportunities to design around some of the constraints of individual hard disks.
The anatomy of a hard disk. Credit: https://www.researchgate.net/figure/Mechanical-components-of-a-typical-hard-disk-drive_fig8_224323123
As I was preparing for my talk at FAST, I asked Tim Rausch if he could help me revisit the old plane flying over blades of grass hard drive example. Tim did his PhD at CMU and was one of the early researchers on heat-assisted magnetic recording (HAMR) drives. Tim has worked on hard drives generally, and HAMR specifically for most of his career, and we both agreed that the plane analogy – where we scale up the head of a hard drive to be a jumbo jet and talk about the relative scale of all the other components of the drive – is a great way to illustrate the complexity and mechanical precision that’s inside an HDD. So, here’s our version for 2023.
Imagine a hard drive head as a 747 flying over a grassy field at 75 miles per hour. The air gap between the bottom of the plane and the top of the grass is two sheets of paper. Now, if we measure bits on the disk as blades of grass, the track width would be 4.6 blades of grass wide and the bit length would be one blade of grass. As the plane flew over the grass it would count blades of grass and only miss one blade for every 25 thousand times the plane circled the Earth.
That’s a bit error rate of 1 in 10^15 requests. In the real world, we see that blade of grass get missed pretty frequently – and it’s actually something we need to account for in S3.
Now, let’s go back to that first hard drive, the IBM RAMAC from 1956. Here are some specs on that thing:
Now let’s compare it to the largest HDD that you can buy as of publishing this, which is a Western Digital Ultrastar DC HC670 26TB. Since the RAMAC, capacity has improved 7.2M times over, while the physical drive has gotten 5,000x smaller. It’s 6 billion times cheaper per byte in inflation-adjusted dollars. But despite all that, seek times – the time it takes to perform a random access to a specific piece of data on the drive – have only gotten 150x better. Why? Because they’re mechanical. We have to wait for an arm to move, for the platter to spin, and those mechanical aspects haven’t really improved at the same rate. If you are doing random reads and writes to a drive as fast as you possibly can, you can expect about 120 operations per second. The number was about the same in 2006 when S3 launched, and it was about the same even a decade before that.
This tension between HDDs growing in capacity but staying flat for performance is a central influence in S3’s design. We need to scale the number of bytes we store by moving to the largest drives we can as aggressively as we can. Today’s largest drives are 26TB, and industry roadmaps are pointing at a path to 200TB (200TB drives!) in the next decade. At that point, if we divide up our random accesses fairly across all our data, we will be allowed to do 1 I/O per second per 2TB of data on disk.
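To make that arithmetic concrete, here is a minimal Python sketch. The helper names are made up for illustration, and the 120 operations per second figure is the approximate per-drive number quoted above, assumed to stay flat as capacity grows.

```python
def iops_per_tb(drive_iops: float, capacity_tb: float) -> float:
    """Random I/O operations per second available per TB stored on one drive."""
    return drive_iops / capacity_tb


def tb_per_iops(drive_iops: float, capacity_tb: float) -> float:
    """How many TB of stored data have to share a single I/O per second."""
    return capacity_tb / drive_iops


# Assumption: roughly 120 random operations per second per drive, regardless of size.
DRIVE_IOPS = 120

for capacity_tb in (26, 200):
    print(
        f"{capacity_tb} TB drive: "
        f"{iops_per_tb(DRIVE_IOPS, capacity_tb):.2f} IOPS per TB, "
        f"1 I/O per second per {tb_per_iops(DRIVE_IOPS, capacity_tb):.2f} TB"
    )
```

For a 200TB drive this works out to about 1.67TB per I/O per second, which is the figure rounded to roughly 2TB in the text.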
S3 doesn’t have 200TB drives yet, but I can tell you that we anticipate using them when they’re available. And all the drive sizes between here and there.
Managing heat: data placement and performance
So, with all this in mind, one of the biggest and most interesting technical scale problems that I’ve encountered is in managing and balancing I/O demand across a really large set of hard drives. In S3, we refer to that problem as heat management.
By heat, I mean the number of requests that hit a given disk at any point in time. If we do a bad job of managing heat, then we end up focusing a disproportionate number of requests on a single drive, and we create hotspots because of the limited I/O that’s available from that single disk. For us, this becomes an optimization challenge of figuring out how we can place data across our disks in a way that minimizes the number of hotspots.
Hotspots are small numbers of overloaded drives in a system that end up getting bogged down, resulting in poor overall performance for requests dependent on those drives. When you get a hotspot, things don’t fall over, but you queue up requests and the customer experience is poor. Unbalanced load stalls requests that are waiting on busy drives, those stalls amplify up through layers of the software storage stack, they get amplified by dependent I/Os for metadata lookups or erasure coding, and they result in a very small proportion of higher latency requests, or “stragglers”. In other words, hotspots at individual hard disks create tail latency, and ultimately, if you don’t stay on top of them, they grow to eventually impact all request latency.
As S3 scales, we want to be able to spread heat as evenly as possible, and let individual users benefit from as much of the HDD fleet as possible. This is tricky, because we don’t know when or how data is going to be accessed at the time that it’s written, and that’s when we need to decide where to place it. Before joining Amazon, I spent time doing research and building systems that tried to predict and manage this I/O heat at much smaller scales – like local hard drives or enterprise storage arrays – and it was basically impossible to do a good job of it. But this is a case where the sheer scale and the multitenancy of S3 result in a system that is fundamentally different.
The more workloads we run on S3, the more that individual requests to objects become decorrelated with one another. Individual storage workloads tend to be really bursty. In fact, most storage workloads are completely idle most of the time and then experience sudden load peaks when data is accessed. That peak demand is much higher than the mean. But as we aggregate millions of workloads a really, really cool thing happens: the aggregate demand smooths and it becomes way more predictable. In fact, and I found this to be a really intuitive observation once I saw it at scale, once you aggregate to a certain scale you hit a point where it is difficult or impossible for any given workload to really influence the aggregate peak at all! So, with aggregation flattening the overall demand distribution, we need to take this relatively smooth demand rate and translate it into a similarly smooth level of demand across all of our disks, balancing the heat of each workload.
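The smoothing effect can be seen in a small simulation. This is only a sketch under stated assumptions: each workload is modelled as independent, idle most of the time, and occasionally issuing a fixed-size burst. The burst probability, burst size, and function names are all invented for illustration and are not measurements from S3.

```python
import random


def bursty_workload(rng, steps, burst_prob=0.02, burst_size=100):
    """One workload: idle most of the time, with occasional fixed-size bursts."""
    return [burst_size if rng.random() < burst_prob else 0 for _ in range(steps)]


def peak_to_mean(series):
    """Ratio of peak demand to mean demand; higher means spikier."""
    mean = sum(series) / len(series)
    return max(series) / mean if mean else float("inf")


def aggregate_peak_to_mean(n_workloads, steps=1000, seed=0):
    """Sum n independent bursty workloads and report the peak-to-mean ratio."""
    rng = random.Random(seed)
    total = [0] * steps
    for _ in range(n_workloads):
        for t, demand in enumerate(bursty_workload(rng, steps)):
            total[t] += demand
    return peak_to_mean(total)


for n in (1, 10, 100, 1000):
    print(f"{n:>5} workloads: peak/mean = {aggregate_peak_to_mean(n):.2f}")
```

Under these assumptions, the peak-to-mean ratio falls as more workloads are added together, which is the flattening described above. Correlated workloads would smooth less.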
Replication: data placement and durability
In storage systems, redundancy schemes are commonly used to protect data from hardware failures, but redundancy also helps manage heat. Redundancy schemes spread load out and give you an opportunity to steer request traffic away from hotspots. As an example, consider replication as a simple approach to encoding and protecting data. Replication protects data if disks fail by just having multiple copies on different disks. But it also gives you the freedom to read from any of the disks. When we think about replication from a capacity perspective, it’s expensive. However, from an I/O perspective – at least for reading data – replication is very efficient.
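A minimal sketch of that read-steering freedom, in Python. The disk names, the queue-depth bookkeeping, and the "pick the least busy replica" policy are assumptions made for illustration. This is not a description of how S3 actually routes reads.

```python
from typing import Dict, List


def pick_replica(replicas: List[str], queue_depth: Dict[str, int]) -> str:
    """Return the replica disk with the fewest outstanding requests."""
    return min(replicas, key=lambda disk: queue_depth[disk])


def read_object(replicas: List[str], queue_depth: Dict[str, int]) -> str:
    """Send one read to the least busy replica and record the extra load."""
    disk = pick_replica(replicas, queue_depth)
    queue_depth[disk] += 1
    return disk


# Hypothetical state: disk-a is a hotspot, disk-b is nearly idle.
queue_depth = {"disk-a": 7, "disk-b": 1, "disk-c": 4}
replicas = ["disk-a", "disk-b", "disk-c"]

chosen = [read_object(replicas, queue_depth) for _ in range(5)]
print(chosen)
print(queue_depth)
```

Because every copy can serve the read, new requests drain toward the idle disks and the hot disk receives no additional load until the others catch up.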
We obviously don’t want to pay a replication overhead for all of the data that we store, so in S3 we also make use of erasure coding. For example, we use an algorithm, such as Reed-Solomon, and split our object into a set of k “identity” shards. Then we generate an additional set of m parity shards. As long as k of the (k+m) total shards remain available, we can read the object. This approach lets us reduce capacity overhead while surviving the same number of failures.
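The capacity trade-off between the two schemes can be sketched with a little arithmetic. The parameters below (3 copies, and k=6 with m=2) are illustrative assumptions only and are not S3's actual configuration. The helper names are also made up for this example.

```python
def replication_overhead(copies: int) -> float:
    """Bytes stored per byte of user data when keeping full copies."""
    return float(copies)


def erasure_overhead(k: int, m: int) -> float:
    """Bytes stored per byte of user data with k data shards and m parity shards."""
    return (k + m) / k


def can_read(available_shards: int, k: int) -> bool:
    """An erasure-coded object is readable while at least k shards survive."""
    return available_shards >= k


# Hypothetical parameters chosen so both schemes survive two lost disks.
COPIES, K, M = 3, 6, 2

print(f"replication x{COPIES}: overhead {replication_overhead(COPIES):.2f}x, "
      f"survives {COPIES - 1} failures")
print(f"erasure coding {K}+{M}: overhead {erasure_overhead(K, M):.2f}x, "
      f"survives {M} failures")
print("readable after losing m shards:", can_read(K + M - M, K))
print("readable after losing m+1 shards:", can_read(K + M - (M + 1), K))
```

With these example numbers, both schemes tolerate two failures, but erasure coding stores about 1.33 bytes per user byte instead of 3.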
The impact of scale on data placement strategy
So, redundancy schemes let us divide our data into more pieces than we need to read in order to access it, and that in turn provides us with the flexibility to avoid sending requests to overloaded disks, but there’s more we can do to avoid heat. The next step is to spread the placement of new objects broadly across our disk fleet. While individual objects may be encoded across tens of drives, we intentionally put different objects onto different sets of drives, so that each customer’s accesses are spread over a very large number of disks.
There are two big benefits to spreading the objects within each bucket across lots and lots of disks:
A customer’s data only occupies a very small amount of any given disk, which helps achieve workload isolation, because individual workloads can’t generate a hotspot on any one disk.
Individual workloads can burst up to a scale of disks that would be really difficult and really expensive to build as a stand-alone system.
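A toy simulation shows why broad placement gives both benefits. It is a sketch under simple assumptions: each object is split into a fixed number of shards, and each object's shards land on a uniformly random set of distinct disks. The fleet size, shard count, and function name are invented for illustration. Real placement has to account for far more than this.

```python
import random
from collections import Counter


def place_objects(n_objects, n_disks, shards_per_object, seed=0):
    """Place each object's shards on a random set of distinct disks.

    Returns a Counter mapping disk id -> number of shards from this bucket.
    """
    rng = random.Random(seed)
    load = Counter()
    for _ in range(n_objects):
        for disk in rng.sample(range(n_disks), shards_per_object):
            load[disk] += 1
    return load


# Hypothetical bucket: 1,000 objects, 9 shards each, on a 10,000-disk fleet.
load = place_objects(n_objects=1000, n_disks=10_000, shards_per_object=9)
total_shards = sum(load.values())

print("disks holding some of this bucket:", len(load))
print("largest share of the bucket on one disk:",
      f"{max(load.values()) / total_shards:.4%}")
```

Although each object only touches nine disks, the bucket as a whole ends up on thousands of them, and no single disk holds more than a tiny share of it.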
Here's a spiky workload
For instance, look at the graph above. Think about that burst, which might be a genomics customer doing parallel analysis from thousands of Lambda functions at once. That burst of requests can be served by over a million individual disks. That’s not an exaggeration. Today, we have tens of thousands of customers with S3 buckets that are spread across millions of drives. When I first started working on S3, I was really excited (and humbled!) by the systems work to build storage at this scale, but as I really started to understand the system I realized that it was the scale of customers and workloads using the system in aggregate that really allows it to be built differently. Building at this scale means that any one of those individual workloads is able to burst to a level of performance that just wouldn’t be practical to build if they were building without this scale.
The human factors
Beyond the technology itself, there are human factors that make S3 - or any complex system - what it is. One of the core tenets at Amazon is that we want engineers and teams to fail fast, and safely. We want them to always have the confidence to move quickly as builders, while still remaining completely obsessed with delivering highly durable storage. One strategy we use to help with this in S3 is a process called “durability reviews.” It’s a human mechanism that’s not in the statistical 11 9s model, but it’s every bit as important.
When an engineer makes changes that can result in a change to our durability posture, we do a durability review. The process borrows an idea from security research: the threat model. The goal is to provide a summary of the change, a comprehensive list of threats, then describe how the change is resilient to those threats. In security, writing down a threat model encourages you to think like an adversary and imagine all the nasty things that they might try to do to your system. In a durability review, we encourage the same “what are all the things that might go wrong” thinking, and really encourage engineers to be creatively critical of their own code. The process does two things very well:
It encourages authors and reviewers to really think critically about the risks we should be protecting against.
It separates risk from countermeasures, and lets us have separate discussions about the two sides.
When working through durability reviews we take the durability threat model, and then we evaluate whether we have the right countermeasures and protections in place. When we are identifying those protections, we really focus on identifying coarse-grained “guardrails”. These are simple mechanisms that protect you from a large class of risks. Rather than nitpicking through each risk and identifying individual mitigations, we like simple and broad strategies that protect against a lot of stuff.
Another example of a broad strategy is demonstrated in a project we kicked off a few years back to rewrite the bottom-most layer of S3’s storage stack – the part that manages the data on each individual disk. The new storage layer is called ShardStore, and when we decided to rebuild that layer from scratch, one guardrail we put in place was to adopt a really exciting set of techniques called “lightweight formal verification”. Our team decided to shift the implementation to Rust in order to get type safety and structured language support to help identify bugs sooner, and even wrote libraries that extend that type safety to apply to on-disk structures. From a verification perspective, we built a simplified model of ShardStore’s logic (also in Rust) and checked it into the same repository alongside the real production ShardStore implementation. This model dropped all the complexity of the actual on-disk storage layers and hard drives, and instead acted as a compact but executable specification. It wound up being about 1% of the size of the real system, but allowed us to perform testing at a level that would have been completely impractical to do against a hard drive with 120 available IOPS. We even managed to publish a paper about this work at SOSP.
From here, we’ve been able to build tools and use existing techniques, like property-based testing, to generate test cases that verify that the behaviour of the implementation matches that of the specification. The really cool bit of this work wasn’t anything to do with either designing ShardStore or using formal verification tricks. It was that we managed to kind of “industrialize” verification, taking really cool, but kind of research-y techniques for program correctness, and getting them into code where normal engineers who don’t have PhDs in formal verification can contribute to maintaining the specification, and that we could continue to apply our tools with every single commit to the software. Using verification as a guardrail has given the team confidence to develop faster, and it has endured even as new engineers joined the team.
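The general shape of checking an implementation against an executable specification can be sketched in a few lines. This is a toy in Python, not ShardStore and not the team's Rust tooling: `ModelStore`, `LogStore`, and `check_equivalence` are hypothetical names, the "implementation" is a tiny append-only log with an index, and random operation sequences from the standard library stand in for a real property-based testing framework.

```python
import random


class ModelStore:
    """Executable specification: the simplest thing that could be correct."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        self._data.pop(key, None)


class LogStore:
    """Toy implementation: an append-only log plus an index of offsets."""

    def __init__(self):
        self._log = []
        self._index = {}

    def put(self, key, value):
        self._index[key] = len(self._log)
        self._log.append((key, value))

    def get(self, key):
        offset = self._index.get(key)
        return None if offset is None else self._log[offset][1]

    def delete(self, key):
        self._index.pop(key, None)


def check_equivalence(seed, n_ops=200, n_keys=8):
    """Apply one random operation sequence to both stores and compare reads."""
    rng = random.Random(seed)
    model, impl = ModelStore(), LogStore()
    for _ in range(n_ops):
        op = rng.choice(("put", "get", "delete"))
        key = rng.randrange(n_keys)
        if op == "put":
            value = rng.randrange(1000)
            model.put(key, value)
            impl.put(key, value)
        elif op == "delete":
            model.delete(key)
            impl.delete(key)
        else:
            assert impl.get(key) == model.get(key), (seed, key)
    # Final state must agree for every key, not just the ones that were read.
    return all(impl.get(k) == model.get(k) for k in range(n_keys))


print(all(check_equivalence(seed) for seed in range(50)))
```

The model is small enough to review by eye, and a harness like this one is cheap enough to run on every commit, which is the property the text is describing.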
Durability reviews and lightweight formal verification are two examples of how we take a really human, and organizational view of scale in S3. The lightweight formal verification tools that we built and integrated are really technical work, but they were motivated by a desire to let our engineers move faster and be confident even as the system becomes larger and more complex over time. Durability reviews, similarly, are a way to help the team think about durability in a structured way, but also to make sure that we are always holding ourselves accountable for a high bar for durability as a team. There are many other examples of how we treat the organization as part of the system, and it’s been interesting to see how once you make this shift, you experiment and innovate with how the team builds and operates just as much as you do with what they are building and operating.
Scaling myself: Solving hard problems starts and ends with “Ownership”
The last example of scale that I’d like to tell you about is an individual one. I joined Amazon as an entrepreneur and a university professor. I’d had tens of grad students and built an engineering team of about 150 people at Coho. In the roles I’d had in the university and in startups, I loved having the opportunity to be technically creative, to build really cool systems and incredible teams, and to always be learning. But I’d never had to do that kind of role at the scale of software, people, or business that I suddenly faced at Amazon.
One of my favourite parts of being a CS professor was teaching the systems seminar course to graduate students. This was a course where we’d read and generally have pretty lively discussions about a collection of “classic” systems research papers. One of my favourite parts of teaching that course was that about half way through it we’d read the SOSP Dynamo paper. I looked forward to a lot of the papers that we read in the course, but I really looked forward to the class where we read the Dynamo paper, because it was from a real production system that the students could relate to. It was Amazon, and there was a shopping cart, and that was what Dynamo was for. It’s always fun to talk about research work when people can map it to real things in their own experience.
But also, technically, it was fun to discuss Dynamo, because Dynamo was eventually consistent, so it was possible for your shopping cart to be wrong.
I loved this, because it was where we’d discuss what you do, practically, in production, when Dynamo was wrong. When a customer was able to place an order only to later realize that the last item had already been sold. You detected the conflict but what could you do? The customer was expecting a delivery.
This example may have stretched the Dynamo paper’s story a little bit, but it drove to a great punchline. Because the students would often spend a bunch of discussion trying to come up with technical software solutions. Then someone would point out that this wasn’t it at all. That ultimately, these conflicts were rare, and you could resolve them by getting support staff involved and making a human decision. It was a moment where, if it worked well, you could take the class from being critical and engaged in thinking about tradeoffs and design of software systems, and you could get them to realize that the system might be bigger than that. It might be a whole organization, or a business, and maybe some of the same thinking still applied.
Now that I’ve worked at Amazon for a while, I’ve come to realize that my interpretation wasn’t all that far from the truth — in terms of how the services that we run are hardly “just” the software. I’ve also realized that there’s a bit more to it than what I’d gotten out of the paper when teaching it. Amazon spends a lot of time really focused on the idea of “ownership.” The term comes up in a lot of conversations — like “does this action item have an owner?” — meaning who is the single person that is on the hook to really drive this thing to completion and make it successful.
The focus on ownership actually helps understand a lot of the organizational structure and engineering approaches that exist within Amazon, and especially in S3. To move fast, to keep a really high bar for quality, teams need to be owners. They need to own the API contracts with other systems their service interacts with, they need to be completely on the hook for durability and performance and availability, and ultimately, they need to step in and fix stuff at three in the morning when an unexpected bug hurts availability. But they also need to be empowered to reflect on that bug fix and improve the system so that it doesn’t happen again. Ownership carries a lot of responsibility, but it also carries a lot of trust – because to let an individual or a team own a service, you have to give them the leeway to make their own decisions about how they are going to deliver it. It’s been a great lesson for me to realize how much allowing individuals and teams to directly own software, and more generally own a portion of the business, allows them to be passionate about what they do and really push on it. It’s also remarkable how much getting ownership wrong can have the opposite result.
Encouraging ownership in others
I’ve spent a lot of time at Amazon thinking about how important and effective the focus on ownership is to the business, but also about how effective an individual tool it is when I work with engineers and teams. I realized that the idea of recognizing and encouraging ownership had actually been a really effective tool for me in other roles. Here’s an example: In my early days as a professor at UBC, I was working with my first set of graduate students and trying to figure out how to choose great research problems for my lab. I vividly remember a conversation I had with a colleague who was also a pretty new professor at another school. When I asked them how they choose research problems with their students, they flipped. They had a surprisingly frustrated reaction. “I can’t figure this out at all. I have like 5 projects I want students to do. I’ve written them up. They hum and haw and pick one up but it never works out. I could do the projects faster myself than I can teach them to do it.”
And ultimately, that’s actually what this person did — they were amazing, they did a bunch of really cool stuff, and wrote some great papers, and then went and joined a company and did even more cool stuff. But when I talked to grad students that worked with them what I heard was, “I just couldn’t get invested in that thing. It wasn’t my idea.”
As a professor, that was a pivotal moment for me. From that point forward, when I worked with students, I tried really hard to ask questions, and listen, and be excited and enthusiastic. But ultimately, my most successful research projects were never mine. They were my students’, and I was lucky to be involved. The thing that I don’t think I really internalized until much later, working with teams at Amazon, was that one big contribution to those projects being successful was that the students really did own them. Once students really felt like they were working on their own ideas, and that they could personally evolve them and drive them to a new result or insight, it was never difficult to get them to really invest in the work and the thinking to develop and deliver it. They just had to own it.
And this is probably one area of my role at Amazon that I’ve thought about and tried to develop and be more intentional about than anything else I do. As a really senior engineer in the company, of course I have strong opinions and I absolutely have a technical agenda. But if I interact with engineers by just trying to dispense ideas, it’s really hard for any of us to be successful. It’s a lot harder to get invested in an idea that you don’t own. So, when I work with teams, I’ve kind of taken the strategy that my best ideas are the ones that other people have instead of me. I consciously spend a lot more time trying to develop problems, and to do a really good job of articulating them, rather than trying to pitch solutions. There are often multiple ways to solve a problem, and picking the right one is letting someone own the solution. And I spend a lot of time being enthusiastic about how those solutions are developing (which is pretty easy) and encouraging folks to figure out how to have urgency and go faster (which is often a little more complex). But it has, very sincerely, been one of the most rewarding parts of my role at Amazon to approach scaling myself as an engineer being measured by making other engineers and teams successful, helping them own problems, and celebrating the wins that they achieve.
Closing thought
I came to Amazon expecting to work on a really big and complex piece of storage software. What I learned was that every aspect of my role was unbelievably bigger than that expectation. I’ve learned that the technical scale of the system is so enormous, that its workload, structure, and operations are not just bigger, but foundationally different from the smaller systems that I’d worked on in the past. I learned that it wasn’t enough to think about the software, that “the system” was also the software’s operation as a service, the organization that ran it, and the customer code that worked with it. I learned that the organization itself, as part of the system, had its own scaling challenges and provided just as many problems to solve and opportunities to innovate. And finally, I learned that to really be successful in my own role, I needed to focus on articulating the problems and not the solutions, and to find ways to support strong engineering teams in really owning those solutions.
I’m hardly done figuring any of this stuff out, but I sure feel like I’ve learned a bunch so far. Thanks for taking the time to listen.
A step by step guide for solving a difficult organizational problem, including notes on single stack ranks, team interdependencies, building consensus, reducing work in progress, and how to move your company towards bett
Ted Chiang on how artificial intelligence may strengthen capitalism by promising to concentrate wealth and disempower workers, and on possible alternatives.
For the past several years, I’ve run a learning circle with engineering executives. The most frequent topic that comes up is career management–what should I do next? The second most frequent topic is measuring engineerin
Disclaimer: Please note that I am not a lawyer. All the information provided here related to how I've integrated ChatGPT into my workflow should be considered with caution. It is important to use your own judgement when d
Resilience is coping with unexpected events and environmental change. To have resilience, you need slack. Slack in software development lets people do the little tasks that keep the work moving smoothly. That helps with
Python programs, usually short, of considerable difficulty, to perfect particular skills. - pytudes/Probability.ipynb at main · norvig/pytudes
Day 4/90 My friend Luca once told me that if I wanted to make real progress with my photography, I needed to stop taking photos randomly and instead work on series. It makes a lot of sense. By focusing on speci
Did you have a situation when you lost a ton of time finding a Go library for your need? In theory, you can check lists like Awesome Go or make a choice based on GitHub stars. But Awesome Go contains over 2600 libraries,
In this post I reflect on our progress so far, and some of the interesting challenges we are facing while building the BBC’s critical digital services. If you’re a technology builder, interested in…
In this post, I hope to explore different forms of “testing in production”, when each form of testing is the most beneficial as well as how to test services in production in a safe way. However…
Pleased to Meet You... Video version available on YouTube: In 2024, I’ll have been a programmer for 40 years. I’m not quite there yet, but I’ll get there. T
Information Theory is one of the most useful things for Computer Scientists to understand about statistics, and it's deeply related to things we do every day. If you are comfortable writing code but sometimes intimidated
Detailed Class on Carabiners. You will learn why there are so many different Shapes / Sizes & Styles of the Carabiners. How to choose & use them Safely and a...
Posted on Feb 8 2023 This year (2023) is the year of platform engineering. Don’t believe me? Here is the esteemed (and it has to be said lovely) Charles Humble totally accurately quoting the ravings of some madman o
When someone sends you a pull request from a fork or branch of your repository, you can merge it locally to resolve a merge conflict or to test and verify the changes before merging on GitHub.
Steve Jobs, one of the computer industry’s foremost entrepreneurs, gives a wide-ranging talk to a group of MIT Sloan School of Management students in the spr...
The GOVNO framework is a novel approach to project management that aims to improve upon the shortcomings of the popular scrum methodology. Each letter of the acronym represents a key aspect of the framework: G: Governan
Whether you implement or not microservices (and you probably shouldn’t), your system is most probably composed of multiple components. The most straightforward system is probably made of a reverse proxy, an app, and a da
Credit: Comfreak / Pixabay By now, we’ve all heard of the great resignation. Over the past 18 months or so, many people have had more time to think about what they want from their jobs, and the kind of conditions they
The handheld cellphone is 50 years old and has become an essential multi-tool that helps us run our lives. But is it altering the way our brains work?
Raycast API Search… ⌘K Introduction LINKS Community GitHub Store Icon Generator Extension Icon Template BASICS Getting Started Create Your First Extension Contribute to an Extension Prepare an Extension for Store Publish
Related Resources Michael Nielsen on Twitter Michael Nielsen's project announcement mailing list cognitivemedium.com By Michael Nielsen One day in the mid-1920s, a Moscow newspaper
From navigating code to debugging tests, building Go projects in VS Code has never been easier with the latest editor tools. This demo showcased how Go in V...
Software systems are increasingly based on data, rather than code. A new class of tools and technologies have emerged to process data for both analytics and ML.
When you‘re in the midst of starting a business, while also writing a book, like me and my business partner Sara currently are with The Intentional Organization, your mind can feel all over the place. It‘s constantly ove
All I wanted was a widget. A few days ago, as I was playing around with my Lock Screen on iOS 16, I wondered: would it be possible to use the hidden Apple Notes URL scheme to create widget launchers to reopen specific no
Is the ping of a text stealing our focus or do we just lack willpower? And could mindless scrolling ever be good for our brains? Elle Hunt unpacks some surprising truths
Alfred lets you automate anything on your Mac. Chris Messina shows us how to set up Alfred and create desktop workflows to boost productivity. Chapters - St...
Why is Booz Allen renting us back our own national parks? (mattstoller.substack.com)
A handful of mental models that, when combined, will provide a clear understanding of what is required to be a great engineering leader at different stages of your company.
Twitter is an online social networking service where users can post and read short messages called “tweets”. Most candidates will design twitter as a monolithic service in system design interviews…
Most of the forest lives in the shadow of the giants that make up the highest canopy. These are the oldest trees, with hundreds of children and grandchildren. They check in with their neighbors, share food, supplies and
So 2023 is the year of the cloud development environment (CDE). That’s like one of the longest running jokes in tech- it’s the year of Linux on the desktop. Or maybe we’re finally approaching the time that a cloud-first
Tripsy is more than just an app for storing details about your upcoming trips. It does that and does it well, but it’s also a great way to revisit old trips and get inspired about places you want to visit in the future.
If you're a bit sad like me, one of the most interesting features of iOS 15 is Focus Mode. Bringing some much-needed updates to do not disturb that went before it, and also making your phone much more customisable in dif
Back in 2019. At that time, I was deploying a data lake strategy inside Michelin and we had decided to build the main one as a platform to enable more data use case. I was having a coffee while reading my mail. In my box
Podman Desktop is a free alternative to Docker Desktop that’s another great option for local development use. It offers a similar feature set while remaining entirely open-source, letting you avoid the licensing implicat
Over recent months, tech companies have been laying workers off by the thousands. It is estimated that in 2022 alone, over 120,000 people have been dismissed from their job at some of the biggest players in tech – Meta,
This guide shows you how to write, structure, visualize and manage software architecture documentation using appropriate documentation tools.
Tags Go , Go tooling Most tutorials on Go tooling (and probably most other tooling) tend to focus on the happy path - the input code is perfectly vali
Introduction this collection of thoughts on software development gathered by grug brain developer grug brain developer not so smart, but grug brain developer program many long year and learn some things although mostly s
As we enter the ninth iteration of watchOS, I must admit that I sometimes find myself looking back wistfully on the computer watch that the Apple Watch once was. My inner tech nerd misses the wild, blind shots at digital
Have you presented to company executives about a key engineering initiative, walking into the room excited and leaving defeated? Maybe you only made it to your second slide before unrelated questions derailed the discuss
For half a century, the Stanford computer scientist Donald Knuth, who bears a slight resemblance to Yoda — albeit standing 6-foot-4 and wearing glasses — has reigned as the spirit-guide of the algorithmic realm. He is th
Slack’s Incident on 2-22-22 (slack.engineering)
(Open link)
JavaOne is back! ➱ https://oracle.com/javaone Java 8 launched in March of 2014, Java 18 in March of 2022. There are 8 years of progress, 203 JDK Enhancemen...
(Open link)
At Nile, we’re making it easier for companies to build world-class control planes for their infra SaaS products. Multi-tenancy is core to all SaaS products and especially those with control-plane architectures. At Nile,
(Open link)
Based on experience gained from several microservices migrations, these seven lessons can help you be successful and overcome or avoid common challenges.
(Open link)
When you are new to Docker, the number of commands to study might be truly overwhelming. This article shows a way to internalize the most important Docker commands without the brute-force memorization.
(Open link)
In Zeebe.io — a horizontally scalable distributed workflow engine I explained that Zeebe is a super performant, highly scalable and resilient cloud-native workflow engine (yeah — buzzwords checked!)…
(Open link)
In the world of modern portable devices, it may be hard to believe that merely a few decades ago the most convenient way to keep track of time was a mechanical watch. Unlike their quartz and smart siblings, mechanical wa
(Open link)
The 150 year old device that made a 50 year old Integrated Development Environment (IDE) rock Engelbart keyset: One handed chorded keyboard Photograph @Richard Mark courtesy of Computer
(Open link)
So we have two questions here: One, how do we draw boundaries around our computer programs such that we minimize the risk of any one concept failing? Two, how do we define any one term such that it is impossible for ther
(Open link)
Junior Engineers care about writing Software. They value code quality, employ best pra
(Open link)
An eight-week training plan that will build serious chest muscle. The workout is made up of bodyweight exercises so is perfect for a home workout plan
(Open link)
A .mvn directory in the root of our project can contain some useful extras. For example we can set default Maven options or Java VM options when we run a Maven command. We can also define Maven extensions we want to add
(Open link)
It’s no secret that I’m not a fan of big-company HR practices. I’m more of the First Break all the Rules type. Despite my general skepticism of many standard practices, we still do annual performance reviews at my comp
(Open link)
You can use this guide to understand what Java's Project loom is all about and how its virtual threads (also called 'fibers') work under the hood.
(Open link)
For the term "user space" as used in Wikipedia, see Wikipedia:User pages. "Kernel space" redirects here. For the mathematical definition, see Null space. This article needs additional citations for verification. Please h
(Open link)
By preemptively taking action if you expect to receive a bad performance review, you may be able to steer a different course for yourself and avoid a foregone conclusion — or at least feel better about the outcome. The a
(Open link)
How container networking works under the hood? Setting up docker-like container networking from scratch. Bonus: podman rootless container networking explained.
(Open link)
The aim of the present study was to determine the effect of different nap opportunity durations on short-term maximal performance, attention, feelings, muscle soreness, fatigue, stress and sleep. Twenty physically active
(Open link)
As we collect various observability signals from systems, it fosters a new conversation around the classification of the signals. There is a significant discussion on observability signals and even…
(Open link)
https://twitter.com/mipsytipsy/status/1309604622554640384 There are few engineering topics that provoke as much heated commentary as oncall. Everybody has a strong opinion. So let me say straight up that there are few if
(Open link)
Basically, sake comprises rice, water, and the fermenting agent called koji, resulting in an alcoholic level that usually sits between 13 and 16 percent. And you might be interested to know that the rice used is differen
(Open link)
Making Sense Of Single Action Lists In OmniFocus OmniFocus knows three different types of Projects: Sequential Projects, Parallel Projects and Single Action Lists. You would typically think that most...
(Open link)
Oct 3, 2014 Literate programming: Knuth is doing it wrong Literate programming advocates this: Order your code for others to read, not for the compiler. Beautifully typeset your code so one can curl up in bed to read it
(Open link)
Hundreds of companies have no access to JIRA, Confluence and Atlassian Cloud. What can engineering teams learn from the poor handling of this outage?
(Open link)
Copyright Disclaimer Under Section 107 of the Copyright Act 1976, allowance is made for fair use for purposes such as criticism, comment, news reporting, tea...
(Open link)
Welcome to the third interview on 'The Observer Effect'. We are lucky to have one of the most interesting founders in technology and commerce - Tobi Lütke, Founder and CEO of Shopify. This interview was
(Open link)
We recently had the pleasure of participating in QCon, a global conference that gathers the best engineers from top-notch innovation companies. The event covers a wide range of relevant software…
(Open link)
The SQL standard defines isolation levels, but we don't have a great vocabulary for talking about consistency levels, and how these two things interact.
(Open link)
Resignations happen in a moment, and it’s not when you declare, “I’m resigning.” The moment happened a long time ago when you received a random email from a good friend who asked, “I know you’re really happy with your cu
(Open link)
I recently contributed to a series of fireside chats hosted by LaunchDarkly. One of the themes that we discussed was Developer Experience, and how to improve it. I was asked by one of the attendees at the EMEA event (vid
(Open link)
At most companies, people put together a deck, reserve a room (physical or virtual), and call a meeting to pitch a new idea. If they're lucky, no one interrupts them while they're presenting. When it's over, people react
(Open link)
In a frank conversation, the Apple CEO offers new insight into his leadership—and explains how he has refashioned the world’s most creative company (from its privacy policy to its Oscar-winning movies to what’s coming ne
(Open link)
Kubernetes. Nowadays, it seems companies in the industry are divided into two pools: those that already use it heavily for their production workloads and those that are migrating their workloads into…
(Open link)
Selecting the correct MTT* metric to improve your incident response is important. If the wrong metric is chosen, the improvements may get lost in the noise of a multivariable equation.
(Open link)
Fallacies of distributed systems are a set of assertions made by L Peter Deutsch and others at Sun Microsystems describing false assumptions that programmers new to distributed applications invariably make.
(Open link)
Manage different toggles differently As discussed earlier, there are various categories of Feature Toggles with different characteristics. These differences should be embraced, and different toggles
(Open link)
When most folks talk about the economics of cloud systems, their focus is on automatically scaling for long-term seasonality: changes on the order of days (fewer people buy things at night), weeks (fewer people visit the
(Open link)
I have watched many, many TV series. I’ve tried to memorize all of them to some extent, just for the pleasure of being able to rewatch some scenes and recall episodes in order, and thus not forget them, but also for using the
(Open link)
GitHub - TypicalAM/goread: A beautiful program to read your RSS feeds right in the terminal!
(Open link)
Helidon 4.0.0-ALPHA1 is now released with our brand new Helidon Níma, providing a virtual threads-based web server. This is an early access release for those of you interested in the latest Java…
(Open link)
HTML, CSS, JavaScript, Python, PHP, C++, Dart — there are so many programming languages out there and you may even be totally fluent in several of them! But
(Open link)
"...the mere consciousness of an engagement will sometimes worry a whole day." – Charles Dickens July 2009 One reason programmers dislike meetings so much is that they're on a different type of schedule from other peo
(Open link)
Dark Sky, my weather app of choice for several years, is no more. This is sad, but I’ve done what any sensible person would and downloaded 12 weather apps from the App Store to find a suitable replacement on my iPhone (I
(Open link)
In this post, learn how relational and NoSQL databases, Google Cloud Spanner and DataStax Astra DB, optimize distributed joins for real-time applications. Distributed joins are commonly considered to…
(Open link)
It’s time for another newsletter from the Omni Group, trumpeting forth from the dark of winter here in chilly Seattle. We’ve got just the thing to help you keep warm (or stay cool, for our dear readers south of the Equat
(Open link)
An introduction to Kafka's architecture and the design mechanics that support Kafka's powerful, real-time data streaming and integration features.
(Open link)
August 2, 2020 Pat Helland Recently, there has been a lot of interest in services. These can be microservices or just services. In each case, the service provides a function with its own code and data, and opera
(Open link)
Have you ever found that you enter the week without a clear plan of action? Do you wish that you could articulate each team member’s role and responsibilities more clearly? In a dynamic, fast-paced…
(Open link)
A step by step guide for solving a difficult organizational problem, including notes on single stack ranks, team interdependencies, building consensus, reducing work in progress, and how to move your company towards bett
(Open link)
When someone tells you that a particular video game (like Elden Ring) is hard, it can be tough to figure out what they might mean by that because games are hard in different ways. As Tolstoy might have said had he been a
(Open link)
Hacker News: Authorization in a Microservices World (alexanderlolis.com) 239 points by zaplin 1 day ago
(Open link)
KOTTKE.ORG: A Demo of Pockit, a Tiny, Powerful, Modular Computer, posted by Jason Kottke, Mar 15, 2022. Admission time: it’s been a long t
(Open link)
What is a LAN? What is a network segment? Ethernet collision domains vs broadcast domains. How network switches work? How to send IP packets? VLAN vs VXLAN.
(Open link)
Organizations are re-architecting their traditional monolithic applications to incorporate microservices. This helps them gain agility and scalability and accelerate time-to-market for new features. Each microser
(Open link)
Sometimes music can be so beautiful that words seem to fall short. There are tons of words to describe music that can work like a perfect pitch.
(Open link)
We all believe we should exercise more. So why is it so hard to keep it up? Daniel E Lieberman, Harvard professor of evolutionary biology, explodes the most common and unhelpful workout myths.
(Open link)
Table of Contents: Step 1: Become a User; Step 2: Build the Project; Step 3: Learn the Hot-Path Internals; Trace Down; Learn Up; Experiment and Break Things; Supplement with Media; Step 4: Read and Reimplement Recent Commits; Step 5: Make
(Open link)
a) why this name instead of “repeaters”, the most used name in the English-speaking world, and b) why put forward this method and compare its effects on strength and endurance with those obtained through maximum strengt
(Open link)
Software systems are increasingly based on data, rather than code. And a new class of tools and technologies have emerged to process data for both analytics and operational AI/ ML.
(Open link)
A resilient system continues to operate successfully in the presence of failures. There are many possible failure modes, and each exercises a different aspect of resilience. The system needs to…
(Open link)
👨💻 For the price of $7.99 every month, sign up and gain access to a growing list of premium courses on my site - https://tutorialedge.net/pricing/ Welcome...
(Open link)
Software is rarely designed in isolation. It doesn't happen by one person going away in their corner for some time, then coming back with a thoroughly thought-out map of what to build and how. And as they explain the app
(Open link)
Go 1, the first stable release of Go, came with a compatibility promise. This talk will explain why that's important, what it does and doesn't mean, and the ...
(Open link)
September 27, 2011. Dear readers, it is with troublesome news I break my three months of silence. The
(Open link)
I spend way too much time on video conferences. Between my day to day at VMware, and the Kubernetes/CNCF communities I probably spend >50% of my time on Zoom. (There is an option to enable keyboard…
(Open link)
This article has a large gap in the story: it ignores sensor data sources, which are both the highest velocity and highest volume data models by multiple orders of magnitude. They have become ubiquitous in diverse, mediu
(Open link)
Despite advances in browser tooling, automated evaluation, lab tools, guidance, and runtimes, modern teams struggle to deliver even decent performance with today's popular frameworks. This is not a technical problem per
(Open link)
Sometimes you have the requirement to calculate percentages on some values of your data. There are multiple ways of doing it, of course, but often people are not aware that you do not have to calculate these percentages
(Open link)
Reproducible builds are a set of software development practices that create an independently-verifiable path from source to binary code. https://reproducible-builds.org/ Reproducible builds are important and provide benefi
(Open link)
Almost everyone who does great work toils in relative obscurity. Performance reviews are social fiction. How do people really advance through the corporate hierarchy?
(Open link)
The smartest people in the world use mental models to make intelligent decisions, avoid stupidity, and increase productivity. Let's take a look at how ...
(Open link)
A collection of articles on common problems startups face when scaling. 16 March 2022 Tim Cochran, Carl Nygard, and Roni Smith Accumulation of tech debt; experiments and shortcuts are core components Thro
(Open link)
NOTE: Although I was at Spotify for around 8 years, I‘m not familiar with how every area worked and I have my own biases, preferences, etc. AND things change I will define “Spotify Model” as any…
(Open link)
By now, you have probably heard of OpenAI’s ChatGPT, or any of the alternatives GPT-3, GPT-4, Microsoft’s Bing Chat, Facebook’s LLaMa or even Google’s Bard. They are artificial intelligence programs that can participate
(Open link)
Hacker News: How to organize yourself as a solo founder (medium.com/jeel_shah) 135 points by gekkostate 9 hours ago
(Open link)
Welcome to engineering management. It’s fun, it’s exhausting, it’s rewarding — but most importantly it’s new! What worked for you before won’t work now. You’ll have to acquire a new set of skills, and shed some bad habit
(Open link)
There is something legendary and historic happening in software engineering, right now as we speak, and yet most of you don’t realize at all how big it is.
(Open link)
Welcome to the Rust Book experiment, and thank you for your participation! First, we want to introduce you to the new mechanics of this experiment. The main mechanic is quizzes: each page has a few quizzes about the pag
(Open link)
SSH port forwarding explained in a clean and visual way. How to use local and remote port forwarding. What sshd settings may need to be adjusted. How to memorize the right flags.
(Open link)
Learn how you can use the Virtualization framework to quickly create virtual machines on your Mac. We'll show you how to create a virtual...
(Open link)
59 votes, 31 comments. For those that have read the "The Climbing Bible: Technical, physical and mental training for rock climbing" by Martin …
(Open link)
You’ve probably never heard of a blurgit or a swalloop or a grawlix or an agitron, but you see them every day in your newspaper’s comics section. Here’s a primer on the secret language of comic symbols.
(Open link)
Ages ago, a work colleague / friend offered me the following piece of advice: “We need to make sure you’re seen as strategic”. I’ve thought about it a lot ever since. It was the first time I realised I was on the edge
(Open link)
Microservices show where Java lags behind other languages. Reactive programming provides a concise DSL to express the movement of state and to write concurrent, multithreaded code with better scaling.
(Open link)
Arthur C. Brooks and Lori Gottlieb discuss the importance of fun, the cultural distortion of emotions as “good” or “bad,” and how envy points you in the direction of your deepest desires.
(Open link)
Recently I wrote how I got comfortable with writing and publishing more frequently on this website. I thought I would share some specific things I did to make writing happen for me this year.
(Open link)
I put this mind map together a few years ago after a discussion about what a new VP of Engineering or CTO should think about in their first 90 days in a new role. It’s a map of the areas I believe every new technical le
(Open link)
Hacker News: De-cloud and de-k8s – bringing our apps back home (37signals.com) 525 points by mike1o1 1 day ago
(Open link)
The next few posts will be a bit of a wander. I promise we will get back to software design with new tools for thinking about our fundamental dictum: software design is an exercise in human relationships. CEO: I take ful
(Open link)
Often when an organization is going through some turmoil, executives think to themselves, “Ah, I should have some one-on-ones with the team so they can hear how we’re handling this.” On the other side, I frequently hear
(Open link)
Sleep Med. Author manuscript; available in PMC 2018 Sep 1. PMCID: PMC5598771. NIHMSID: NIHMS858129. Abstract: The mid-day na
(Open link)
The phrase "digital garden" is a metaphor for thinking about writing and creating that focuses less on the resulting "showpiece" and more on the process, care, and craft it takes to get there. While not everybody has or
(Open link)
In a way, an error message tells a story; and as with every good story, you need to establish some context about its general settings. For an error message, this should tell the recipient what the code in question was tr
(Open link)
Passkeys are a new way to log into websites and apps that replaces passwords. The industry-standard passkey technology is simpler and more secure than passwords (even with two-factor authentication), resists phishing, an
(Open link)
iPadOS 16. At its keynote held earlier today online and, for a limited audience of developers and media, in Cupertino, Apple unveiled the next major versions of iOS and iPadOS: iOS 16 and iPadOS 16. Both OSes will be rel
(Open link)
Hacker News: The best leaders are great individual contributors, not professional managers (inc.com) 428 points by wslh 1 day ago
(Open link)
I happened across an interesting thread on Twitter today which posed the question about when you should intervene with alternative solutions as a table gets larger over time: It is always good practice to be proactive on
(Open link)
OWEN D. POMERY
(Open link)
New research shows that people with blue eyes have a single, common ancestor. Scientists have tracked down a genetic mutation which took place 6,000-10,000 years ago and is the cause of the eye color of all blue-eyed hum
(Open link)
North Texas’ resident tech genius, John Carmack, is taking aim now at his most ambitious target: solving the world’s biggest computer-science problem by developing artificial general intelligence. That’s a form of AI who
(Open link)
sebastiandaschner blog: Helpful Command Line Scripts and Automation in My Setup (#productivity #commandline), Friday, January 06, 2023. In this video, I’ve put togethe
(Open link)
When I started my journey at LinkedIn ten years ago, the company was just beginning to experience extreme growth in the volume, variety, and velocity of our data. Over the next few years, my colleagues and I in LinkedIn’
(Open link)
A deep dive with five Technical Program Managers (TPM) on what the role is, how it evolved, and how engineers and managers can benefit from working with TPMs.
(Open link)
The need for ad-hoc real-time data analysis has been growing at Uber. They run a large Apache Kafka deployment and need to analyse data going through the many workflows it supports. Solutions like str
(Open link)
When is the last time your iPhone truly surprised you? The answer to this question is a fascinating Rorschach test that can say a lot about a person’s relationship with Apple’s mobile platform. Some might say it was over
(Open link)
Ever since I started to work on the Apache APISIX project, I’ve been trying to improve my knowledge and understanding of RESTful HTTP APIs. For this, I’m reading and watching the following sources: Books. At the mom
(Open link)
Most command line programs that offer line editing – like bash, Python, GDB, psql, sqlite and more – do so using GNU readline. Readline's a powerful library that grants history, completion, movement and editing to progra
(Open link)
By POM I mean Apache Maven's Project Object Model. The POM format is widely used not just by Apache Maven to build and consume projects, but also by other tools such as IDEs, build tools, code analyzers, etc. Understandi
(Open link)
Your healing creates waves that are consciously and unconsciously felt by others. The vibration or energy that you emit reverberates outward and influences the environment and those in it. Your peace can be felt by other
(Open link)
Well, it’s that time once again. It’s time for a new release of the Go programming language. Go 1.18 in Q1 of 2022 was a major release that featured the long awaited addition of generics to the language and also had lots
(Open link)
In practice, computer code is constantly being transformed. At the beginning of a project, the computer code often takes the form of sketches that are gradually refined. Later, the code can be optimized or corrected, som
(Open link)
I was a wayward kid who grew up on the literary side of life, treating math and science as if they were pustules from the plague. So it’s a little strange how I’ve ended up now—someone who dances daily with triple integr
(Open link)
Software systems are increasingly based on data, rather than code. A new class of tools and technologies have emerged to process data for both analytics and ML.
(Open link)
Negative-utilitarianism challenges the moral symmetry of pleasure and pain. It would justify erasing the world were that the only way to eradicate suffering
(Open link)
It’s hardly insightful to suggest that the last few years have substantially changed the day to day experience of a knowledge worker. Nearly overnight even the most remote skeptical leadership teams were forced to embrac
(Open link)
The Go "functional options" pattern is a way of passing options to a function. The function takes a variable number of arguments, which are themselves functions (a type like ...func(*config)). I think it was first introdu
(Open link)
As part of a technology-driven organization, it is certain that the expectations grow exponentially higher every time for long-term sustenance. So, there is always a demand for solutions that offer…
(Open link)
Reading the thought-provoking "Patterns & Abstractions" post reminded me of a long-held opinion I have about programming language design: we have a tendency to keep adding features to a language until it becomes so big
(Open link)
Creative and open-ended tasks, new features to a product, and research tasks seem like they should be simple. You think of a goal, you start your work, and with a lot of effort, you will be…
(Open link)
This article is about a few quick thumb rules I use when writing shell scripts that I’ve come to appreciate over the years. Very opinionated....
(Open link)
Hacker News: The Invention of Free Love (aeon.co) 25 points by pepys 4 hours ago | 8 comments cfigg
(Open link)
In distributed systems, logical clocks play a key role in the ordering of system events. What are the various logical clock designs, and how do they help with event ordering? This article answers these questions.
(Open link)
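For readers who want a concrete picture of a logical clock before opening the entry above, here is a minimal sketch of one well-known design, the Lamport clock. It is not taken from the linked article, and the type and method names (`LamportClock`, `Tick`, `Receive`) are assumptions made for this example.

```go
package main

import "fmt"

// LamportClock is one process's logical counter (hypothetical type name).
type LamportClock struct{ time uint64 }

// Tick records a local event or a message send and returns the new timestamp.
func (c *LamportClock) Tick() uint64 {
	c.time++
	return c.time
}

// Receive merges the timestamp carried by an incoming message: the clock
// jumps to the larger of the local and remote values, then advances by one.
func (c *LamportClock) Receive(remote uint64) uint64 {
	if remote > c.time {
		c.time = remote
	}
	c.time++
	return c.time
}

func main() {
	var a, b LamportClock
	a.Tick()                   // process a: a local event
	sent := a.Tick()           // process a: sends a message stamped with its clock
	b.Tick()                   // process b: an unrelated local event
	received := b.Receive(sent) // process b: receives the message from a
	fmt.Println(sent, received)
}
```

The property the sketch demonstrates is that a receive event always gets a larger timestamp than the matching send, which is what lets timestamps respect the order of causally related events.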
As software engineers, we're constantly making detailed, elaborate plans for computers to execute. Isn't it weird that we rarely give a moment's thought to the program for our own careers?
(Open link)
I’m rewriting The Business for the next book. This piece is almost 15 years old, much has changed in negotiating an offer letter, and I have more advice on how to analyze those offers. So much advice that I am idea paral
(Open link)
Second-order thinking is a mental model that smart people like Warren Buffett & Howard Marks use to avoid problems. Read this article to learn how it works.
(Open link)
Hacker News: Rethinking Visual Programming with Go (divan.dev) 220 points by techplex 22 hours ago | 56 co
(Open link)
A decade of major cache incidents at Twitter. This was co-authored
(Open link)
Agile emerged at a time when companies built and commonly released software in lengthy cycles of 6+ months. Each cycle had an explicit architecture or design phase involving more experienced developers, frequently named
(Open link)
There are too many task management apps to choose from today. I wanted to boil it down to my favorite 3 task managers for macOS! Here they are.
(Open link)
HRV has long been used as an indicator of recovery and readiness in relation to endurance training, but what about correlations with your health?
(Open link)
In this article, Louis Lazaris describes and demonstrates some interesting HTML attributes that you may or may not have heard of and perhaps find useful enough to personally use in one of your projects.
(Open link)
In 2019, to meet growth and availability challenges, we set a plan in motion to improve our tooling and ability to partition relational databases.
(Open link)
In 1952, the Sight and Sound team had the novel idea of asking critics to name the greatest films of all time. The tradition became decennial, increasing in size and prestige as the decades passed. The Sight and So
(Open link)
I introduced Helidon Builder in my previous article, a new tool to help you develop robust fluent builders with just a few easy-to-use annotations. In that article I mentioned how Helidon Builder is…
(Open link)
Bare minimum Monday, the practice of making the first day back at work a low-stress one, is trending. Learn how to do it from career experts.
(Open link)
I was very excited when Apple announced the Apple Watch Ultra this fall. I’ve been wearing an Apple Watch on my wrist nearly every day since they were first released seven years ago. What was so exciting about the Ultra
(Open link)
We know that we have built something which is genuinely useful: almost any team which adopts Slack as their central application for communication would be significantly better off than they were…
(Open link)
THE BLOG OF JOHANNA PIRKER. HLF16: LISKOV’S READING LIST FOR COMPUTER SCIENTISTS, by Johanna. Posted on 23. September 2016. Barbara Liskov pointed us in her talk at Heidelberg L
(Open link)
I’m thrilled that we have hit an exciting milestone the Apache Kafka® community has long been waiting for: we have introduced exactly-once semantics in Kafka in the 0.11 release and […]
(Open link)
Sharma Podila shares from their experience migrating to asynchronous processing at scale, requiring attention to managing data loss, a highly available infrastructure, and elasticity to handle bursts.
(Open link)
Every profession has its pitfalls. Doctors, for example, are always being asked for free medical advice, lawyers are asked for legal information, morticians are told how interesting a profession that must be and then peo
(Open link)
Interested participants will learn what is currently going on in the development of Apache Maven 4.X. The goal is to get an Apache Maven version 4.0.0-alpha-...
(Open link)
Ten years ago, I wanted to use a Lego Mindstorms NXT 2.0 kit to build a robot that could continuously monitor our software development builds.
(Open link)
Design is the art of arranging code to work today, and be changeable forever. — Sandi Metz The last underlying principle I want to highlight is productivity. Developer productivity is a sprawling topic but it boils
(Open link)
Decouple Capability and not Code Whenever developers want to extract a service out of an existing system, they have two ways to go about it: extract code or rewrite capability. Often by default the service extr
(Open link)
I never thought I’d be a watch person. (Or that I’d wear blue shorts.) I didn’t need a watch. I had a phone and a computer to tell me the time. But I got the Apple Watch on day one, because I’m a responsible iPhone app
(Open link)
The layered architecture pattern is one of the most common patterns. The idea behind a Layered pattern is that components with the same functionalities will be organized into horizontal layers. As a…
(Open link)
Hello guys, if you have given any coding interview, then you know that system design or software design problems are an important part of programming job interviews, and if you want to do well, you…
(Open link)
The concept behind frameworks and libraries is to provide reusable code that you can use to perform everyday tasks so that developers don't have to write
(Open link)
In this post, I hope to explore different forms of “testing in production”, when each form of testing is the most beneficial as well as how to test services in production in a safe way. However…
(Open link)
The "Gonzo fist", characterized by two thumbs and four fingers holding a peyote button, was originally used in Hunter S. Thompson's 1970 campaign for sheriff of Pitkin County, Colorado. It has since evolved into a symbol
(Open link)
6.824 Schedule: Spring 2021 TR1-2:30 Here is the tentative schedule of lectures and due dates. The lecture notes and paper questions for future dates are copies from previous years, and may change. The Zoom link for le
(Open link)
Software engineers are attracted to formulas, algorithms, and structures. As people whose job it is to take ideas and turn them into predictable executable code, it is unsurprising that we’re drawn…
(Open link)
There’s been a lot of discussion about platforms recently, I talked about why I think it’s a current hot meme on the WTF Podcast with Charles Humble recently, and Sam Newman just wrote a blog post…
(Open link)
With millions of users and one of the most sophisticated microservice-driven platforms today, Doordash was the perfect testing ground for an automated resilience testing tool developed by Meiklejohn, and members of the C
(Open link)
Gwen Shapira talks about how microservices evolved in the last few years, based on experience gained while working with companies using Apache Kafka to update their application architecture.
(Open link)
VS Code's new merge editor interface gives you the view you've been missing for easily resolving merge conflicts.
(Open link)
At the Sundance Film Festival, the Cut hosted a special screening of Nisha Ganatra’s short film, Rise, followed by a panel discussion about entrepreneurship and the entertainment business. Latasha Gillespie, Alia Shawka
(Open link)
Level Up - The Perplexing Platform Team. I've been thinking a lot about platform teams this week. When I work with a CTO at new companies, at some point, the topic
(Open link)
Hacker News: Study Tips from Richard Feynman (piggsboson.medium.com)
(Open link)
Zhuangzi was the gadfly of ancient Chinese philosophy. His paradoxical writings encourage a stance of therapeutic scepticism towards the world.
(Open link)
To capture meaning of words in their vectors, we first need to define the notion of meaning that can be used in practice. For this, let us try to understand how we, humans, get to know which w
(Open link)
I just passed my one-year anniversary of being a Senior Research Director for Database at Gartner and it’s been every bit the provocative experience I anticipated. The role has added an enjoyable, new dimension to my pro
(Open link)
Background: SkipLists often come up when discussing “obscure” data structures, but in reality they are not that obscure; in fact, many production-grade software systems actively use them. In this post I’ll try to go into Ski
(Open link)
Hacker News: Bob Metcalfe wins Turing Award (acm.org)
(Open link)
The Micronaut framework provides a solid foundation for building Cloud Native Java microservices. It reduces the use of Java reflection, runtime proxy generation, and dynamic classloading.
(Open link)
A resilient system continues to operate successfully in the presence of failures. There are many possible failure modes, and each exercises a different aspect of resilience. The system needs to…
(Open link)
This best-practices article is intended for developers interested in creating RESTful Web services that provide high reliability and consistency across multiple service suites; following these…
(Open link)
Is subvocalization keeping you from becoming a faster reader? Here's what the science says — and how to actually increase your reading speed.
(Open link)
Some years ago, I was explaining to my manager that I was feeling a bit bored, and they told me to learn how to read a Profit & Loss (P&L) statement. At the time, that sounded suspiciously like, “Stop wasting my time,” b
(Open link)
Occam’s razor is one of the most useful (yet misunderstood) models in your mental toolbox to solve problems more quickly and efficiently. Here’s how to use it.
(Open link)
In French, “cultiver son jardin intérieur” means to tend to your internal garden—to take care of your mind. The garden metaphor is particularly apt: taking care of your mind involves cultivating your curiosity (the seeds
(Open link)
By “high growth”, I mean in terms of employee count and roughly doubling or more every year. Even at slower growth rates, some of the phenomena I’ll describe may be relevant. The only option where…
(Open link)
https://media.ccc.de/v/rc3-channels-2020-20-what-have-we-lost- We have ended up in a world where UNIX and Windows have taken over, and most people have never ...
(Open link)
Uber’s backend is an exemplar of microservice architecture. Each microservice is a small, individually deployable program performing a specific business logic (operation). The microservice architecture is a type of distr
(Open link)
JavaOne is back! ➱ https://oracle.com/javaone Sequenced collections introduce an abstraction for collections with a known encounter order like all lists and...
(Open link)
It isn’t exactly news that developers have to write not only code but also tests. Nevertheless, many find this to be burdensome and monotonous work. Plus, it is far from guaranteed that unit tests actually cover all the
(Open link)
What started as lighthearted iconoclasm, poking at the bear of SOLID, has developed into something more concrete and tangible. If I do not think the SOLID principles are useful these days, then what would I replace them
(Open link)
This time of the year always excites us: it brings a look at what’s new in iPhone photography. First up is our brief look at the technical specifications of the new iPhone 14 Pro cameras. Our next post field tests the n
(Open link)
JDK 15 was released on September 15, 2020. JEP 360 Sealed Types was included as a preview feature in this release. Sealed Types is part of Project Amber. Sealed classes or interfaces can be used to…
(Open link)
Another factor, Mr. Stoppelman said, was a crucial decision, unusual at the time, to locate the company in a San Francisco office building instead of a Silicon Valley office park. “I’m not sure that Yelp would have succee
(Open link)
AWS Lake Formation (coming soon) will make it easy to set up a secure data lake in days. With AWS Lake Formation, you will be able to ingest, catalog, clean, transform, and secure your data, and make it available for ana
(Open link)
This blog is written in Asciidoc, built using Hugo, and hosted on GitHub Pages. I recently wanted to share the draft of a post I was writing with someone and ended up exporting a local preview to a PDF - not a great work
(Open link)
If you ask StackOverflow or ChatGPT, how to convert an InputStream to a String in Java, you get archaic constructs with buffered readers and tedious loops. In modern Java, you achieve this task and similar ones with a si
(Open link)
I’ve been reading up on observability over the last three months. In this post I have organized the material into a sort of recommended reading order. It doesn’t reflect the order in which I read it, but I think this ord
(Open link)
"Hunter Thompson" redirects here. For the musician, see Hunter G. K. Thompson. Hunter S. ThompsonThompson at Caesars Palace in Las Vegas in 1971BornHunter Stockton Thompson July 18, 1937 Louisville, Kentucky, U.S.DiedFe
(Open link)
We draw on years of testing to crown the best beginner fountain pen, best gel pen, best pen for note-taking, and more. Each recommendation includes links to related guides so that you can evaluate the competition for yo
(Open link)
Many apps today are actually a front-end for a series of API calls. APIs are necessary to the proper functioning of such applications, but if you don’t protect them, bad actors can exfiltrate data, DDoS your servers, or othe
(Open link)
In this blog, you will learn some Docker best practices mainly focussed on Java applications. This is not only a theoretical exercise, but you will learn how to apply the best practices to your Dockerfiles. Enjoy! 1. Int
(Open link)
This story explores some concepts in computer networking, inspired by Michael Nielsen’s idea of discovery fiction. Code samples can also be found in this repo. Excerpts use openbsd-flavoured netcat on Debian Linux; behvi
(Open link)
Hacker News: Guide to Software Architecture Documentation (workingsoftware.dev)
(Open link)
GitHub - maeddes/options-galore-container-build: Sample code and instructions for steps through different container image build options.
(Open link)
Hacker News: Be Less Technical (sequential.dev)
(Open link)
202 votes, 150 comments. 282k members in the shortcuts community. This subreddit is devoted to Shortcuts. Shortcuts is an Apple app for automation …
(Open link)
This is a simple checklist, and while it is useful to any software engineer, it is especially useful to senior engineers. More items from the list can be found here.
(Open link)
There is something delightful about riding a bicycle. Once mastered, the simple action of pedaling to move forward and turning the handlebars to steer makes bike riding an effortless activity. In the demonstration below,
(Open link)
There are so many brilliant posts on GPT-3, demonstrating what it can do, pondering its consequences, visualizing how it works. With all these out there, it still took a crawl through several papers
(Open link)
The C4 model consists of a hierarchical set of software architecture diagrams. These diagrams are both easy to create and easy to understand for multiple intended audiences.
(Open link)
Hello film friends, We know, we know. We can’t believe we are ripping through August either. But we are barreling headlong into fall festival season, bub, so get used to it. Luckily, there’s an inordinately large number
(Open link)
This is the story of Simon Wardley. Follow his journey from bumbling and confused CEO lost in the headlights of change to someone with a vague idea of what they're doing.
(Open link)
With billing, the devil is in the details. In this article, Raffi provides a high-level view of the technical challenges we faced while implementing a hybrid pricing (based on both 'subscription' and 'usage') at 5x unico
(Open link)
SSH port forwarding explained in a clean and visual way. How to use local and remote port forwarding. What sshd settings may need to be adjusted. How to memorize the right flags.
(Open link)
Code review is often a pain point, so I end up talking to most clients about it. Here are some quick fixes for common problems I often end up suggesting to them.
(Open link)
Hacker News: Evidence that life flashes before the eyes upon death (hyperallergic.com)
(Open link)
I’ve enjoyed writing software for 40+ years now. Lots of activities fall into that “writing software” basket, and here’s my favorite: When you have a body of code wi
(Open link)
Feature flags allow you to ship code more frequently, test in production, and wow your users by revealing the feature at the right moment.
(Open link)
Sharon Salzberg, Judith Simmer-Brown, John Tarrant, and Dzogchen Ponlop Rinpoche discuss skillful and unskillful involvement with emotions, offering new perspectives on how to think about and engage with our emotional li
(Open link)
Given thoughtfully and at the right time, constructive criticism is extremely valuable. These tips can ensure the criticism you give is actually constructive.
(Open link)
Hacker News: How to put machine learning models into production (stackoverflow.blog)
(Open link)
One of the most important things I learned from running a startup is that on a macro scale the innovation market is efficient. If the market conditions allow for a startup to arise, it’s overwhelmingly probable that mult
(Open link)
For many people, anchovies are one of those foods to be avoided like the plague. But for Ken Gargett anchovies are not a love-it-or-hate it food. Rather, they are a love-it-or-you-have-not-discovered-how-good-they-can-be
(Open link)
Hacker News: Life is not short (dkb.show)
(Open link)
I have begun to notice a common blind spot among engineers who are frustrated about leveling up, which is: not every opportunity exists at every company at every time
(Open link)
All software applications are composed of re-usable elements. The objective and functionality of these reusable elements vary from infra level concern to security level concern to business…
(Open link)
This article covers the fundamentals of Kubernetes pods, from how they function to how they are created and terminated. It also provides advice on how to deal with common problems.
(Open link)
In the years after World War I, longtime Army colleagues and friends George S. Patton and Dwight D. Eisenhower contemplated what would happen if another global conflict broke out. As Patton envisioned it: “In the next wa
(Open link)
Software projects are displayed by Gource as an animated tree with the root directory of the project at its centre. Directories appear as branches with files as leaves. Developers can be seen working on the tree at the t
(Open link)
Since we started PostHog, our team has interviewed 725 people. What's one thing I've taken from this? It's normal for candidates not to ask harder…
(Open link)
In this post, understand the different concepts of consistency as applied to distributed databases, as well as some issues with the conversation of consistency.
(Open link)
In a way, an error message tells a story; and as with every good story, you need to establish some context about its general settings. For an error message, this should tell the recipient what the code in question was tr
(Open link)
In this article, we dive into the possibilities of mechanical keyboards. The different layouts, switch types and even keycap material. Strap yourself in — this will be a deep dive!
(Open link)
Research suggests there is some heritable component of cognitive characteristics in dogs, which could have important implications for understanding the capabilities of different breeds.
(Open link)
Improve Java application performance by choosing the best garbage collector for your application's throughput, latency, and footprint requirements.
(Open link)
Hacker News: Carefully exploring Rust as a Python developer (karimjedda.com)
(Open link)
Are you looking for a new approach to health? Do you want to finally get the results you have been hoping for? How do you find a practitioner that is willing to try a different approach and guide you through your journey
(Open link)
Data scientists excel at creating models that represent and predict real-world data, but effectively deploying machine learning models is more of an art than science. Deployment requires skills more commonly found in sof
(Open link)
A while back, I wrote about the fact that logs need an overhaul, and that practices that were relevant when logs were still text messages in files may no longer be relevant in an age when logs traces…
(Open link)
When hiring developers, there are many things we are looking for, but over the years I have found that raw coding ability is easily the most important quality to look for. I can quickly train a person to have knowledge
(Open link)
In this post I want to write about probably the most powerful text editor there is: Emacs and how to set it up so you can begin programming and live coding with Overtone in no time. Many users would disagree and argue th
(Open link)
Viorel Spînu: My technical expertise. I thought I would put together some of the technologies I've used so far, and came up with the timeline below. It seems that in the last few years I had a lot of free time :)
(Open link)
The R/H/E box is outdated and largely unhelpful, yet it appears prominently on every major league scoreboard and TV broadcast. Here's how it got that way.
(Open link)
Today is my birthday. I turn 70. I’ve learned a few things so far that might be helpful to others. For the past few years, I’ve jotted down bits of unsolicited advice each year and much to my surprise I have more to add
(Open link)
2022 February 01 16:21, stuartscott: You may have noticed that the January edition of the Convey Digest looks a little different from the previous ones - the color scheme is now based on the dominant
(Open link)
Hacker News: How to write more clearly, think more clearly, and learn complex material [pdf] (covingtoninnovations.com)
(Open link)
Credit: Andrij Borys Associates; Renee French (CC by 3.0) Go is a programming language created at Google in late 2007 and released as open source in November 2009. Since then, it has operated as a public project, wi
(Open link)
Online education by Particular Software
(Open link)
Find out about the best ways to write subject lines for your cold sales emails to improve your open and response rates! This post is here to help you craft a subject line that appeals to your customers & is relevant to y
(Open link)
Plan on a Page Checklist A brief mission statement and vision for your team. What does your team do? What is your mission? What gets you up in the morning—eager to help our customers, company, and community? What is your
(Open link)
I’ve observed thousands of founders and thought a lot about what it takes to make a huge amount of money or to create something important. Usually, people start off wanting the former and end up...
(Open link)
Table of Contents Intro Why Flagger Prerequisites NGINX (Ingress) Canary Deployment Try it out Linkerd (ServiceMesh) Canary Deployment with Ingress support Try it out Summary Intro Lately, I’ve been checkin
(Open link)
Much of expertise is tacit: that is, it cannot be captured through words alone. We look at techniques to learn the tacit knowledge of experts.
(Open link)
Does this Go snippet compile, and if not, why? package main func main() { users := []User{User{"alice"}, User{"bob"}} var _ Named = User{"charlie"} getName(users[0]) getNames(users) } type Named
(Open link)
Donald Knuth’s Annual Christmas Tree Lectures are back! 🌲✨ This year Dr. Knuth will present on Twintrees, Baxter Permutations, and Floorplans. Three fascina...
(Open link)
Authorization? How hard can it be? I am pretty sure that others have already solved it. We are not the first ones doing microservices. It should be easy to integrate what's already out there.
(Open link)
“Doc comments” are comments that appear immediately before top-level package, const, func, type, and var declarations with no intervening newlines. Every exported (capitalized) name should have a doc comment. The go/doc
(Open link)
SpaceWall wallpapers. For this week’s special issue of MacStories Weekly to celebrate Week 2 of Automation April, I dusted off an old shortcut of mine and updated it for the modern era of Shortcuts automations and the ab
(Open link)
Writing tests is common practice these days. How else would you ensure that the code does what you expect? However, some software is business-critical and simply testing a few examples is not enough. The usual workfl
(Open link)
The Prime Video team published this story: Scaling up the audio/video monitoring service and reducing costs by 90%, and the internet piled in with opinions and bad takes, mostly missing the point…
(Open link)
In 2008, showbiz satire 30 Rock aired an episode called “MILF Island.” Poking fun at how tawdry reality television can be, it follows network honcho Jack Donaghy (Alec Baldwin) as he gloats about his newest hit, a realit
(Open link)
Looking for some descriptive words for music? Is that types of music, sounds of music or effects of music? We have examples of them all, from folk to funk and tempo to timbre.
(Open link)
Ready to switch from your Apple Watch Series 7 to the new Series 8 or SE? Or maybe you just need to transfer your existing Apple Watch with your new iPhone? Follow along for how to seamlessly pair a new or existing Apple
(Open link)
Golang is a popular programming language with powerful features. However, one can find it complicated to organize a large codebase while juggling dependencies. Go developers sometimes have to pass…
(Open link)
As long as it’s more than one person AND it’s important that we achieve some collective goal THEN we need some way of facilitating coherent action across an organisation. Without coherence, you’ll…
(Open link)
After needing to do a deep dive on the venv module (which I will explain later in this blog post as to why), I thought I would explain how virtual environments work to help demystify them. Back in my day, there was no
(Open link)
What happens when your distributed service has challenges with stampeding herds of internal requests? How do you prevent cascading failures between internal services? How might you re-architect your workflows when naive
(Open link)
Hacker News: Jazz Is Freedom (thebaffler.com)
(Open link)
Learn how to choose carabiners for rock climbing, and the benefits of locking, nonlocking, wiregate, bent-gate and straight-gate carabiners.
(Open link)
There is no bar for the quality of a blog post. Allow me to be an example. See… every blog post on this entire site. I’d like to write better individual blog posts, but something has always compelled me to punt out a tho
(Open link)
The blog post will share the four phases of Real-time Data Infrastructure’s iterative journey in Netflix (2015-2021). For each phase, we will go over the evolving business motivations, the team’s unique challenges, the
(Open link)
Hacker News: Build the Modular Monolith First (fearofoblivion.com)
(Open link)
In this post we are going to take two fascinating topics, the Monty Hall Problem and the Narrative Fallacy, and see what they can teach us about product planning. Enjoy! 1. Let’s Make a Deal A lot has been written about
(Open link)
This is Part One (of two) of our story chronicling Twitch’s journey from monolithic architecture to microservices. In Part One, you’ll learn about our early days, from our rapid growth to the perfo...
(Open link)
The AWS Well-Architected Framework defines resilience as “the capability to recover when stressed by load (more requests for service), attacks (either accidental through a bug, or deliberate through intention), and failu
(Open link)
Hacker News: Build Your Own Text Editor (2017) (viewsourcecode.org)
(Open link)
“I think the flow of time is not part of the fundamental structure of reality,” theoretical physicist Carlo Rovelli tells me. He is currently working on a theory of quantum gravity in which the variable of time plays no p
(Open link)
An overview of how the InfoQ editorial team sees the Software Architecture and Design topic evolving in 2022, with a focus on what architects are designing for today.
(Open link)
A common situation we all face during a project is when we should compare two alternatives, for example, the most common for me is during performance analysis. Usual situations could range from evaluating the improvement
(Open link)
This post is written by Luca Mezzalira, Principal Specialist Solutions Architect. Today, AWS is launching a preview of AWS Application Composer, a visual designer that you can use to build your serverless applica
(Open link)
Hacker News: Dotfiles Management (mitxela.com)
(Open link)
Hacker News: How to do hard things (every.to/no-small-plans)
(Open link)
Jeremy wrote a little something about streams, in particular about streams on personal websites. His home page actually is like a stream: links, notes, and blog posts all appear underneath each other in chronological ord
(Open link)
Mercurial 5.2 was released on November 5, 2019. It is the first version of Mercurial that supports Python 3. This milestone comes nearly 11 years after Python 3.0 was first released on December 3, 2008. Speaking as a mai
(Open link)
Imagine you are buying a car. What essential features do you need in it? A vehicle should deliver a person from point A to point B. But what we also check in it is Safety, Comfort, Maintainability…
(Open link)
Why is it so frickin hard to provide people with valuable feedback, let alone getting meaningful feedback from others? One of two things tends to be the case: we either don’t know how to ask for good feedback, or we don’
(Open link)
Learn how packets flow inside and outside a Kubernetes cluster. Starting from the initial web request and down to the container hosting the application
(Open link)
This article will try to decode these technologies and explore how developers should consider containers or serverless functions within their tech stack.
(Open link)
Hacker News: Infrastructure SaaS – a control plane first architecture (thenile.dev)
(Open link)
This is a four day Rust course developed by the Android team. The course covers the full spectrum of Rust, from basic syntax to advanced topics like generics and error handling. It also includes Android-specific content
(Open link)
An unlikely customer anecdote and some simple bloom filters helped a Principal Engineer solve a duplicate problem on the Prime Video homepage.
(Open link)
Seth Godin, the world-renowned marketing and leadership author inspires us on how to get our ideas spread when mass marketing and traditional advertising hav...
(Open link)
If you use retrospectives, or any kind of meeting where people are supposed to discuss and learn from their discussions, you will have experienced less efficient sessions from time to time. There is no wonder
(Open link)
I first discovered sarcasm as a freshman in college, which I realize makes me a bit of a late bloomer as far as teenagers go. There were certain classmates who seemed to always come across as clever and funny no matter t
(Open link)
Recently, while watching a lecture by Niall Ferguson, I learned a new concept with an unfamiliar name: counterfactual history. The human mind tends to find causal narrative in everything, and we tend…
(Open link)
Generative Pre-trained Transformer models (GPTs) are now all the rage and have inspired op-eds being written by everyone from Henry Kissinger (WSJ) to Noam Chomsky (NYTimes) in just the last month. That sure is some hype
(Open link)